1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480

1481 CTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAATAA 1520

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560 

1561 CCATCCACATCTACCAGATATAGAGTTCGl'GTGAGGTATG 1600 

1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640 

1641 TTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 1680 

1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720 

1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760

1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800

1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840

1841 CTGAGTACAACCTTGAGAGAGCCCAGAAGGCTGTGAACGC ' 1880 

1881 CCTCTTTACCTCCACCAATCAGCTTGGCTTGAAAACTAAC 1920 

1921 GTTACTGACTATCACATTGACCAAGTGTCCAACTTGGTCA 1960 

1961 CCTACCTTAGCGATGAGTTCTGCCTCGACGAGAAGCGTGA 2C00 

2001 ACTCTCCGAGAAAGTTAAACACGCCAAGCGTCTCAGCGAC 2040 

2041 GAGAGGAATCTCTTGCAAGACTCCAACTTCAAAGACATCA 2080 

2081 ACAGGCAGCCAGAACGTGGTTGGGGTGGAAGCACCGGGAT 2120 

2121 CACGATCCAAGGAGGCGACGATGTGTTCAAGGAGAACTAC 2160

2161 GTCACCCTCTCCGGAACTTTCGACGAGTGCTACCCTACCT 2200

2201 ACTTGTACCAGAAGATCGATGAGTCCAAACTCAAAGCCTT 2240

2241 CACCAGGTATCAACTTAGAGGCTAGATCGAAGACAGCCAA 2280

2281 GACCTTGAAATCTACTCGATCAGGTACAATGCCAAGCACG 2320

2321 AGACCGTGAATGTCCCAGGTACTGGTTCCCTCTGGCCACT 2360

2361 TTCTGCCCAATCTCCCATTGGGAAGTGTGGAGAGCCTAAC 2400

2401 AGATGCGCTCCACACCTTGAGTGGAATCCTGACTTGGACT 2440

2441 GCTCCTGCAGGGATGGCGAGAAGTGTGCCCACCATTCTCA 2480 

2481 TCACTTCTCCTTGGACATCGATGTGGGATGTACTGACCTG 2520 

2521 AATGAGGACCTCGGAGTCTGGGTCATCTTCAAGATCAAGA 2560 

2561 CCCAAGACGGACACGCAAGACTTGGCAACCTTGAGTTTCT 2600

2601 CGAAGAGAAACCATTGGTCGGTGAAGCTCTCGCTCGTGTG 2640

2641 AAGAGAGCAGAGAAGAAGTGGAGGGACAAACGTGAGAAAC 2680

2681 TCGAATGGGAAACTAACATCGTTTACAAGGAGGCCAAAGA 2720

2721 GTCCGTGGATGCTTTGTTCGTGAACTCCCAATATGATCAG 2760

2761 TTGCAAGCCGACACCAACATCGCCATGATCCACGCCGCAG 2800

2801 ACAAACGTGTGCACAGCATTCGTGAGGCTTACTTGCCTGA 2840

2841 GTTGTCCGTGATCCCTGGTGTGAACGCTGCCATCTTCGAG 2880

2881 GAACTTGAGGGACGTATCTTTACCGCATTCTCCTTGTACG 2920

2921 ATGCCAGAAACGTCATCAAGAACGGTGACTTCAACAATGG 2960

2961 CCTCAGCTGCTGGAATGTGAAAGGTCATGTGGACGTGGAG 3000

3001 GAACAGAACAATCAGCGTTCCGTCCTGGTTGTGCCTGAGT 3040 

3041 GGGAAGCTGAAGTGTCCCAAGAGGTTAGAGTCTGTCCAGG 3080 

3081 TAGAGGCTACATTCTCCGTGTGACCGCTTACAAGGAGGGA 3120 

3121 TACGGTGAGGGTTGCGTGACCATCCACGAGATCGAGAACA 3160

3161 ACACCGACGAGCTTAAGTTCTCCAACTGCGTCGAGGAAGA 3200

3201 AATCTATCCCAACAACACCGTTACTTGCAACGACTACACT 3240

3241 GTGAATCAGGAAGAGTACGGAGGTGCCTACACTAGCCGTA 3280

3281 ACAGAGGTTACAACGAAGCTCCTTCCGTTCCTGCTGACTA 3320

3321 TGCCTCCGTGTACGAGGAGAAATCCTACACAGATGGCAGA 3360

3361 CGTGAGAACCCTTGCGAGTTCAACAGAGGTTACAGGGACT 3400

3401 ACACACCACTTCCAGTTGGCTATGTTACCAAGGAGCTTGA 3440

3441 GTACTTTCCTGAGACCGACAAAGTGTGGATCGAGATCGGT 3480

3481 GAAACCGAGGGAACCTTCATCGTGGACAGCGTGGAGCTTC 3520

3521 TCTTGATGGAGGAA 3534.

a structural gene which codes for an insecticidal protein of B.t.t., having the sequence:

1 ATGACTGCAGACAACAACACCGAAGCCCTCGACAGTTCTA 40

41 CCACTAAGGATGTTATCCAGAAGGGTATCTCCGTTGTGGG 80 

81 AGACCTCTuGGGCGTGGTTGGATTTCCCTTCGGTGGAGCC 120

121 CTCGTGAGCTTCTATACAAACTTTCTCAACACCATTTGGG 160

161 CAAGCGAGGACCCTTGGAAAGCATTCATGGAGCAAGTTGA 200 

201 AGCTCTTATGGATCAGAAGATTGCAGATTATGCCAAGAAC 240 

241 AAGGCTTTGGCAGAACTCCAGGGCCTTCAGAACAATGTGG 280 

281 AGGACTACGTGAGTGCATTGTCCAGCTGGCAGAAGAACCC 320 

321 TGTTAGCTCCAGAAATCCTCACAGCCAAGGTAGGATCAGA 360 

361 GAGTTGTTCTCTCAAGCCGAATCCCACTTCAGAAATTCCA 400 

401 TGCCTAGCTTTGCTATCTCCGGTTACGAGGTTCTTTTCCT 440 

441 CACTACCTATGCTCAAGCTGCCAACACCCACTTGXTTCTC 480 

481 CTTAAGG ACGCT C AAAT C7ATGGAGAAGAGTGGGGATACG 520 

521 AGAAAGAGGACATTGCTGAGTTCTACAAGCGTCAACTTAA 560 

561 GCTCACCCAAGAGTACACTGACCATTGCGTGAAATGGTAT 500 

601 AACGTTGGTCTCGATAAGCTCAGAGGCTCTTCCTACGAGT. . 640 

641 CTTGGGTGAACTTCAACAGATACAGGAGAGAGATGACCTT 680 

681 GACTGTGCTCGATCTTATCGCACTCTTTCCCTTGTACGAT 720 

721 GTGAGACTCTACCCAAAGGAAGTGAAAACTGAGCTTACCA 760

761 GAGACGTGCTCACTGACCCTATTGTCGGAGTCAACAACCT 800

801 TAGGGGTTATGGAACTACCTTCAGCAATATCGAAAACTAC 840

841 ATTAGGAAACCACATCTCTTCGACTATCTTCACAGAATTC 880

881 AATTCCACACAAGGTTTCAACCAGGATACTATGGTAACGA 920 

921 CTCCTTCAACTATTGGTCCGGTAACTATGTTTCCACCAGA 960 

9 61 CCAAGCATTGGATCTAATGACATCATCACATCTCCCTTCT 1000 

1001 ATGGTAACAAGTCCAGTGAACCTGTGCAGAACCTTGAGTT 1040 

1041 CAACGGCGAGAAAGTCTATAGAGCCGTCGCAAACACCAAT 1080 

1081 CTCGCTGTGTGGCCATCCGCAGTTTACTCAGGCGTCACAA 1120 

1121 AGGTGGAGTTTAGTCAGTATAACGATCAGACCGATGAGGC 1160 

1161 CAGCACCCAGACTTACGACTCCAAACGTAACGTTGGCGCA 1200 

1201 GTCTCT7GGGATTCTATCGACCAATTGCCTCCAGAAACCA 1240 

1241 CAGACGAACCATTGGAGAAGGGCTACAGCCACCAACTTAA 12 80 

1281 CTATGTGAXGTGCTTCTTGATGCAAGGTTCCAGAGGGACC 1320 

1321 ATTCCAGTGTTGACCTGGACACACAAGTCCGTGGACTTCT 13 60 

1361 TCAACATGATCG ATAGCAAGAAGATCACTCAACTTCCCTT 14 00 

14 01 GGT5AAAGCCTACAAGCTGCAATCTGGTGCTTCCGTTGTC 14 40 

14 41 GCAGGTCCCAGATTCACTGGAGGTGACATCATCCAGTGCA 1480 

14 81 CAGAGAACGGCAGCGCAGCTACTATCTACGTGACACCTGA 1520 
1521 TGTGTCTTACTCTCAGAAGTACAGGGCACGTATTCATTAC 1560 

1561 GCATCTACCAGCCAGATCACCTTCACACTCAGCTTGGATG 1600

1601 GAGCACCCTTCAACCAGTATTACTTTGACAAGACCATCAA 1640

1641 CAAAGGTGACACTCTCACATACAATAGCTTCAACTTGGCA 1680

1681 AGTTTCAGCACACCATTTGAACTCTCAGGCAACAATCTTC 1720

1721 AGATCGGCGTCACCGGTCTCAGCGCCGGAGACAAAGTCTA 1760

1761 CATCGACAAGATTGAGTTCATCCCAGTGAAC 1791.

I. a structural gene which codes for an insecticidal protein of B.t. entomocidus, having the sequence:

1 ATGGAGGAGAACAACCAAAACCAATGCATTCCATACAACT 40 

41 GCTTGAGTAACCCAGAAGAGGTATTGC7TGATGGAGAACG SO 

81 CATTTCAACCGG7AACTCTTCCATCGACATCTCCTTGTCC 120 

121 TTGGTCCAGTTTCTGGTCAGCAACTTCGTGCCAGGTGGTG 160 

151 GGTTCCTTGTCGGACTAATTGACTTCGTCTGGGGTATCGT 200 

201 TGGTCCATCTCAATGGGATGCArTCCTGGTGCAAATTGAG . 240 

241 CAGTTGATCAACGAGAGGATCGCTGAGTTCGCCAGGAACG 280 

281 CTGCCATCGCTAACTTGGAAGGATTGGGCAATAACTTCAA 320 

221 CATCTATGTGGAGGCCTTCAAAGAGTGGGAAGAGGACCCT 360 

3 61 AACAACCCAGAGACCCGCACTAGGGTGATCGACAGATTCA 400 • 
401 GAATCTTGGACGGCCTCTTGGAGAGAGATATCCCATCCTT 440 

4 41 CAGAATCTCTGGCTTCGAAGTTCCTCTCTTGTCCGTGTAC 4 80 




EP 0 385 962 B1 

4 81 GCTCAAGCAGCTAATCTTCACCTCGCTATCCTTCGAGACA 520 

521 GTGTCATCTTTGGGGAAAGGTGGGGATTGACCACTATCAA 560 

561 CGT CAAT GAGAAT T ACAACAG ACTT ATCAGGCAC ATT G AC 600 

601 GAGTACGCCGACCACTGTGCTAACACCTACAACCGTGGCT 640 

641 TGAACAATCTCCCTAAGTCTACTTATCAAGATTGGATTAC 680 

681 CTACAACAGGTTGAGGAGAGACTTGACCCTCACAGTTTTG 720 

721 GACATTGCAGCTTTCTTCCCGAACTATGACAACAGGAGAT 760 

7 61 ACCCTATCCAACCAGTGGGTCAACTTACCAGAGAAGTCTA 800 

801 TACTGACCCACTTATCAACxTCAAC CCTCAGTTGCAAAGT 840 

841 GT CGCCC AACTTC CCAC ATTC AACGTCATGG AGTCCAGCC 880 

8S1 GIATCAGGAACCCACACTTGTTTGACATCTTGAACAACCT 920 

'•921 TACTATCTTCACCGArTGGTTCAGCGTTGGGCGTAACTTC 950 

961 TATTGGGGTGGACACAGGGTCATCTCCTCTCTTATTGGAG 1000 

1001 GTGGGAACATTACCTCTCCTATCTATGGACGTGAGGCAAA 1040 

1041 CCAGGAGCCACCACGTAGTTTCACCTTCAACGGTCCAGTC 1080 

10 81 TTCAGAACCTTGTCTAACCCTACCTTGAGATTGCTCCAGC 1120 

1121 AACCTTGGCCAGCTCCACCTTTCAACCTTAGAGGTGTTGA 1160 

1161 GGGCGTTGAGTTCTCTACTCCTACCAACTCCTTCACTTAC 1200 

1201 AGAGGTAGAGGAACCGTTGATTCCTTGACCGAACTCCCAC 1240

1241 CAGAGGACAATAGCGTGCCACCCAGGGAAGGCTACTCCCA 1280
1281 CAGGTTGTGCCACGCAACCTTCGTGCAGCGTTCCGGAACT 1320 
1321 CCATTCCTCACTACAGGAGTTGTGTTCTCATGGACTGATC 1360 

13 61 GTAGTGCTACTCTCACTAATACCATTGATCCCGAGAGGAT 1400 
1401 CAATCAAATCCCATTGGTCAAGGGTTTCCGTGTGTGGGGA 1440 
1441 GGAACTTCTGTCATCACAGGACCAGGCTTCACAGGAGGTG 1480 
1481 ATATTCTTAGAAGAAACACTTTTGGCGACTTTGTGAGCCT 1520 
1521 CCAAGTIAACATCAACTCTCCAATTACTCAAAGATATCGT 1560 
15 Gl CTCAGGTT? C GTT AC GCATCTT CC CGTGAC GCTAGAGT CA 1600 
1601 TCGTGCTCACCGGAGCAGCTTCTACCGGTGTCGGTGGACA 1640 
1641 AGTCTCCGTGAACATGCCACTCCAGAAGACTATGGAGATC 1680 
1681 GGCGAGAACTTGACATCC AGGACCTTCAGATACACCGACT 1720 
1721 TCTCTAACCCTTTCAGTTTCCGTGCCAACCCTGACATCAT 17 60 

17 51 TGGCATTAGCGAACAACCTCTCTTTGGAGCTGGTAGCATC 1800 

18 01 T C AT CT GGCGAATT GT ACATTG AC AAG ATTG AG ATC ATTC 1840 
1841 TTGCCGACGCTACCTTCGAGGCTGAGTCTGACCTTGAGAG 1880 
1881 AGCCCAGAAGGCTGTGAACGCCCTCTTTACCTCCTCTAAT 1320 
1921 CAGATTGGCTTGAAAACTGACGTTACTGACTATCACATTG I960 
1961 ACCAAGTGTCCAACTTGGTCGACTGCCTTAGCGATGAGTT 2000

2001 CTGCCTCGACGAGAAGCGTGAACTCTCCGAGAAAGTTAAA 2040

2041 CACGCCAJVGCG7CTCAGCGACGAGAGGAATCTCTTGCAAG 2080 

2081 ACCCCAACTTCAGAGGCATCAACAGGCAGCCAGACCGTGG 2120 

2121 TTGGAGAGGAAGCACCGACATCACCATCCAAGGAGGCGAC 21S0 

2161 GATGTGTTCAAGGA.GAACTACGTCACCCTCCCAGGAACTG 2200 

2201 TGGACGAGTGCTACCCIACCTACTTGTACCAGAAGATCGA 2240 

2241 TGAGTCCAAACTCAAAGCCTACACCAGGTATGAACTTAGA 2280 

2231 GGCTACATCGAAGACAGCCAAGAGCTTGAAATCTACCTCA 2320 

2321 TCAGGTACAATGCCAAGCACGAGATCGTGAATGTCCCAGG 2360 

23 61 TACTGGTTCCCTCTGGCC&CTTTCTGCCCAAATGCCCATT 2400 
2401 GGGAAGTGTGGAGAGCCTAACAGATGCGCTCCACACCTTG 2440 
2441 AGT GG AAT C CTG ACTTGG ACT GCTC CTGCAGGG ATGGC G A 2480 

24 81 GAAGTGTGCCCACCATTCTCATCACTrCACCTTGGACATC 2520 
2521 GAT GTGGG ATGTACTGACCTGAATGAGGAC CTCGGAGTCT 25 60 

25 61 GGGTCATCTTCAAGATCAAGACCCAAGACGGACACGCAAG 2600 
2601 ACTTGGCAACCTTGAGTTTCTCGAAGAGAAACCATTGCTC 2640
2641 GGTGAAGCTCTCGCTCGTGTGAAGAGAGCAGAGAAGAAGT 2680
2681 GGAGGGACAAACGTGAGAAACTCCAACTCGAGACTAACAT 2720
2721 CGTTTACAAGGAGGCCAAAGAGTCCGTGGATGCTTTGTTC 2760

2761 GTGAACTCCCAATATGATAGGTTGCAAGTGGACACCAACA 2800

2801 TCGCCATGATCCACGCTGCAGACAAACGTGTGCACAGGAT 284 0 

2841 TCGTGAGGCTTACTTGCCTGAGTTGTCCGTGATCCCTGGT 2880 

2881 GTGAACGCTGCCATCTTCGAGGAACTTGAGGGACGTATCT 2920 

2921 TTACCGCATACTCCTTGTACGATGCCAGAAACGTCATCAA 2960 

2961 GAACGGTGACTTCAACAATGGCCTCTTGTGCTGGAATGTG 3000 

3001 AAAGGTCATGTGGACGTGGAGGAACAGAACAATCACCGTT 3040 

3041 CCGTCCTGGTTATCCCTGAGTGGGAAGCTGAAGTGTCCCA 3080 

3081 AGAGGTTAGAGTCTGTCCAGGTAGAGGCTACATTCTCCGT 3120 

3121 GTGACCGCTTACAAGGAGGGATACGGTGAGGGTTGCGTGA 3160 

3161 CCATCCACGAGATCGAGGACAACACCGACGAGCTTAAGTT 3200 

3201 C7CCAACTGCGTCGAGGAAGAAGTCTATCCCAACAACACC 3240 

3241 GTTACTTGCAACAACTACACTGGGACCCAGGAAGAGTACG 3280 

3281 AAGGTACCTACACTAGCCGTAACCAAGGTTACGACGAAGC 3320 

3321 TTACGGAAACAATCCTTCCGTTCCTGCTGACTATGCCTCC 33 60 

33 61 GTGTACGAGGAGAAATCCTACAC AGATGGCAGACGTG AGA 3400 
3401 ACCCTTGCGAGTCCAACAGAGGTTACGGTGACTACACACC 3440 

34 41 ACTTCCAGCAGGCTATGTTACCAAGGACC7TGAGTACTTT 3 480 
3481 CCTGAGACCGACAAAGTGTGGATCGAGATCGGTGAAACCG 3520

3521 AGGGAACCTTCATCGTGGACAGCGTGGAGCTTCTCTTGAT 3560

3561 GGAGGAA 3567.

J. a structural gene which codes for an insecticidal P2 protein, having the sequence:

1 ATGGACAACAACGTCTTGAACTCTGGTAGAACAACCATCT 40 

41 GCGACGCATACAACGTCGTGGCTCACGATCCATTCAGCTT 80 

81 CGAACACAAGAGCCTCGACACTATTCAGAAGGAGTGGATG 120 

121 GAATGGAAACGTACTGACCACTCTCTCTACGTCGCACCTG ISO 

161 TGGTTGGAACAGTGTCCAGCTTCCTTCTCAAGAAGGTCGG 200 

201 CTCTCTCATCGGAAAACGTATCTTGTCCGAACTCTGGGGT 240 

'241 ATCATCTTTCCATCTGGGTCCACTAATCTCATGCAAGACA 280 

281 T C TTGAGGG AGAC CGAACAGTTTCTCAACC AGC GTCTCAA 320 

321 CACTGATACCTTGGCTAGAGTCAACGCTGAGTTGATCGGT 360 

3 51 CTCCAAGCAAACATTCGTGAGTTCAACCAGCAAGTGGACA 400 

401 AC TTCTT GAATCC AACTC AGAATC CTGTGC CTCTTT CCAT 440 

441 CACTTCTTCCGTGAACACTATGCAGCAACTCTTCCTCAAC 480 

431 AGATTGCCTCAGTTTCAGATTCAAGGCTACCAGTTGCTCC 520 

521 TTCTTCCACTCTTTGCTCAGGCTGCCAACATGCACTTGTG 560 

561 CTTCATACGTGACGTGATCCTCAACGCTGACGAATGGGGA 600

601 ATCTCTGCAGCCACTCTTAGGACATACAGAGACTACTTGA 640

641 GGAACTACACTCGTGATTACTCCAACTATTGCATCAACAC 680 

681 TTATCAGACTGCCTTTCGTGGACTCAATACTAGGCTTCAC 720 

721 GACATGCTTGAGTTCAGGACCTACATGTTCCTTAACGTGT 760 '• 

761 TTGAGTACGTCAGCATTTGGAGTCTCTTCAAGTACCAGAG 800 

801 CTTGATGGTGTCCTCTGGAGCCAATCTCTACGCCTCTGGC 840 

' 841 AGTGGACCACAGCAAACTCAGAGCTTCACAGCTCAGAACT 880 

881 GGCCATTCTTGTATAGCTTGTTCCAAGTCAACrc'CAACrA 920 

921 CATTCTC AGTGGTATCTC 1CGGKC CAG ACTCT CCATAAC C 360 

SSI TTTCCCAACATTGGTGGACTTCCAGGCrCCACTACAACCC 1000 

1001 ATAGCC7TAACTCTGCCAGAGTGAACTACAGTGGAGGTGT 1040 

1041 CAGCTCTGGATTGATTGGTGCAACTAACTTGAACCACAAC 1080 

10S1 TTCAATTGCTCCACCGTCTTGCCACCTCTGAGCACACCGT 1120 

1121 TTGTGAGGTCCTGGCTTGACAGCGGTACTGATCGCGAAGG 1160 

1161 AGTTGCTACCTCTACAAACTGGCAAACCGAGTCCTTCCAA 1200 

1201 AC C ACTC TTAGC CTTCGGTGTG G AGCTTTC T CTGCACGXG 1240 

1241 GGAATTCAAACTACTTTCCAGACTACTTCATTAGGAACAT 1280 

1281 CTCTGGTGTTCCTCTCGTCATCAGGAATGAAGACCTCACC 1320 

1321 CGTCCACTTCATTACAACCAGATTAGGAACATCGAGTCTC 1360

1361 CATCCGGTACTCCAGGAGGTGCAAGAGCTTACCTCGTGTC 1400

1401 TGTCCATAACAGGAAGAACAACATCTACGCTGCCAACGAG 1440

1441 AATGGCACCATGATTCACCTTGCACCAGAAGATTACACTG 14 80 

1431 GATTCACCATCTCTCCAATCCATGCTACCCAAG7GAACAA 1520 

1521 TCAGACACGCACCTTCATCTCCGAAAAGTTCGGAAATCAA IS SO 

1561 GGTGACTCCTTG AGGTTCGAGCAATCCAACACTACCGCTA 1600 

1601 GGTACACTTTGAGAGGCAATGGAAACAGCTACAACCT7TA 1640 

1641 CTTGAGAGTTAGCTCCATTGGTAACTCCACCATCCGTGTT 1680 

1681 ACCATCAACGGACGTGTTTACACAGTCTCTAATGTGAACA 1720 

1721 CTACAACGAACAATGATGGCGTTAACGACAACGGAGCCAG 1760 

17 61 ATTCAGCGACATCAACATTGGCAACATCGTGGCCTCTGAC 1800 

1801 AACACTAACGTTACTTTGGACATCAATGTGACCCTCAATT 1840 

1841 CTGGAACTCCATT7GATCTCATGAACATCATGTTTGTGCC 1880 
1881 AACTAACCTCCCTCCATTGTAC 1902 

or

K. a structural gene sequence which codes for a fusion protein comprising the N-terminal 610 amino acids of
B.t.k. HD-1 and the C-terminal 567 amino acids of B.t.k. HD-73, which gene has the sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40

41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80

81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120

121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160

161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200

201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240

241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320

321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400

401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440

441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480

481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520

521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560

561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600

601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640

641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680

681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720

721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760

761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800

801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840

8 41 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880 

881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920 

921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960 

961 ATC ATGGCCT CTC CAGTTG GATTCAGCGGGC CCGAGTTT A 1000 

1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCC5CTCCACA 1040 

1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080 

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCA^TATCG 1120 

1121 GTATCAACAACCAGCAACTT-TCCGTTCTTGACGGAACAGA . 1160 

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200 

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240 

1241 CACCACAGAACAACAATGT GCCACCCAGGCAAGGATTCTC 1280 

1231 CCACAGGTTGAGC CACGTGTCCATGTXCCGT TCCGGATTC 1320 

1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 13 SO 

1361 CATGGATTCATCGTAGTGCTGAGTTCAACAATATCATTCC 1400 

1401 TTCCTCTCAAATCACCCAAATCCCATTGACCAAGtrCTACT 1440 

1441 AAC CTTG GATC TG G AACTT C T GTC GTGAAAG GAC CAG GCT 1480 

1481 TCACAGGAGGTGATATTCTTAGAAGAACTTCTCCTGGCCA 1520

1521 GATTAGCACCCTCAGAGTTAACATCACTGCACCACTTTCT 1560

1561 CAAAGATATCGTGTCAGGATTCGTTACGCATCTACCACTA 1600 

1601 ACTTGCAATTCCACACCTCCATCGACGGAAGGCCTATCAA 1640 

1641 TCAGGGTAACTTCTCCGCAACCATGTCAAGCGGCAGCAAC 1680 

1681 TTGCAATCCGGCAGCTTCAGAACCGTCGGTTTCACTACTC 1720 

1721 CTTTCAACTTCTCTAACGGATCAAGCGTTTTCACCCTTAG 1.760 

1751 CGCTCATGTGTTCAATTCTGGCAATGAAGTGTACATTGAC 1800 

1801 CGTATTGAGTTTGTGCCTGCCGAAGTTACCCTCGAGGCTG 1840 

1841 AGTACAACCTTGAGAGAGCCCAGAAGGC?GTGAACGCCCT 1880 

1881 CrTTACCTCCACCAATCAGCTTGGCTTGAAAACTAACGTT 1320 

1921 ACTGACTATCACATTGACCAAGTGTCCAACTTGGTCACCT 1960 

1961 ACCTTAGCGATGAGTTCTGCCTCGACGAGAAGCGTGAACT 2000 

2001 CTCCGAGAAAGTTAAACACGCCAAGCGTCTCAGCGACGAG 2040 

2041 AGGAATCTCTTGCAAGACTCCAACTTCAAAGACATCAACA 2080 

2081 GGCAGCCAGAACGTGGTTGGGGTGGAAGCACCGGGATCAC 2120 

2121 CATCCAAGGAGGCGACGATGTGTTCAAGGAGAACTACGTC 2160 

2161 ACCCTCTCCGGAACTTTCGACGAGTGCTACCCTACCTACT 2200 

2201 TGTACCAGAAGATCGATGAGTCCAAACTCAAAGCCTTCAC 2240 

2241 CAGGTATCAACTTAGAGGCTACATCGAAGACAGCCAAGAC 2280

2281 CTTGAAATCTACTCGATCAGGTACAATGCCAAGCACGAGA 2320

2321 CCGTGAATGTCCCAGGTACTGGTTCCCTCTGGCCACTTTC 2360 

2361 TGCCCAATCTCCCATTGGGAAGTGTGGAGAGCCTAACAGA 2400 

2401 TGCGCTCCACACCTTGAGTGGAATCCTGACTTGGACTGCT 2440 

2441 CCTGCAGGGATGGCGAGAAGTGTGCCCACCATTCTCATCA 2480 

2481 CTTCTCCTTGGACATCGATGTGGGATGTACTGACCTGAAT 252 0 

2521 GAGGACCTCGGAGTCTGGGTCATCTTCAAGATCAAGACCC 2560 

2561 AAGACGGACACGCAAGACTTGGCAACCTTGAGTTTCTCGA 2 60 0 

2601 AGAGAAACCATTGGTCGGTGAAGCTCTCGCTCGTGTGAAG 2640 

2641 AGAGCAGAGAAGAAGTGGAGGGACAAACGTGAGAAACTCG 2680 

2681 AATGGGAAACTAACATCGTTTACAAGGAGGCCAAAGAGTC 2720 

2721 CGTGGATGCTTTGTTCGTGAACTCCCAATATGATCAGTTG 2760 

27 61 CAAGCCGACACCAACATCGCCATGATCCAC5CCGCAGACA 2800 

2801 AACGTGTGCAC AGCATTCGTGAGGCTTACTT GCCTGAGTT 284 0 

2841 GTCCGTGATCCCTGGTGTGAACGCTGCCATCTTCGAGGAA 2880 

2881 CTTGAGGGACGTATCTTTACCGCATTCXCCTTGTACGATG 2920 < 

2S21 CCAGAAACGTCATCAAGAACGCTGACTTCAACAATGGCCT 2960 

2561 CAGCTGCTGGAATGTGAAAGGTCATGTGGACGTGGAGGAA 3000 

3001 CAGAACAATCAGCGTTCCGTCCTGGTTGTGCCTGAGTGGG 3040

3041 AAGCTGAAGTGTCCCAAGAGGTTAGAGTCTGTCCAGGTAG 3080

3081 AGGCTACATTCTCCGTGTGACCGCTTACAAGGAGGGATAC 3120

3121 GGTGAGGGTTGCGTGACCATCCACGAGATCGAGAACAACA 3160

3161 CCGACGAGCTTAAGTTCTCCAACTGCGTCGAGGAAGAAAT 3200

3201 CTATCCCAACAACACCGTTACTTGCAACGACTACACTGTG 3240

3241 AATCAGGAAGAGTACGGAGGTGCCTACACTAGCCGTAACA 3280

3281 GAGGTTACAACGAAGCTCCTTCCGTTCCTGCTGACTATGC 3320

3321 CTCCGTGTACGAGGAGAAATCCTACACAGATGGCAGACGT 3360

3361 GAGAACCCTTGCGAGTTCAACAGAGGTTACAGGGACTACA 3400

3401 CACCACTTCCAGTTGGCTATGTTACCAAGGAGCTTGAGTA 3440

3441 CTTTCCTGAGACCGACAAAGTGTGGATCGAGATCGGTGAA 3480

3481 ACCGAGGGAACCTTCATCGTGGACAGCCTGGAGCTTCTCT 3520

3521 TGATGGAGGAA 3531.



Claims

1. A method for modifying a wild-type structural gene sequence which encodes an insecticidal protein
of Bacillus thuringiensis in order to enhance the expression of said protein in plants, which comprises:

a) identifying regions within said sequence comprising more than four consecutive adenine or
thymine nucleotides,

b) modifying the regions of step a) which have two or more polyadenylation signals within a
ten-base sequence so as to remove said signals while maintaining a gene sequence which
encodes said protein, and

c) modifying the 15 to 30 base regions surrounding the regions of step a) so as to remove major
plant polyadenylation signals, consecutive sequences containing more than one minor
polyadenylation signal, and consecutive sequences containing more than one ATTTA sequence,
while maintaining a gene sequence which encodes said protein.

2. A method for modifying a wild-type structural gene sequence which encodes an insecticidal protein
of Bacillus thuringiensis in order to enhance the expression of said protein in plants, which comprises:

a) removing the polyadenylation signals contained in said wild-type gene while retaining
a sequence which encodes said protein, and

b) removing the ATTTA sequences contained in said wild-type gene while retaining a
sequence which encodes said protein.
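
Step a) of claim 1 and step b) of claim 2 both amount to simple pattern searches. The following Python sketch is illustrative only and is not part of the claims: it locates runs of at least five consecutive adenine or thymine bases and every occurrence of ATTTA. The function names and the example string are hypothetical.

```python
import re


def at_rich_regions(seq, min_run=5):
    """Return (start, end, run) for each run of at least min_run A/T bases.

    min_run=5 corresponds to "more than four consecutive" A or T nucleotides.
    Positions are 0-based and `end` is exclusive.
    """
    pattern = r"[AT]{%d,}" % min_run
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, seq.upper())]


def attta_positions(seq):
    """Return 0-based start positions of every ATTTA motif, overlaps included."""
    # A lookahead is used so that overlapping motifs are all reported.
    return [m.start() for m in re.finditer(r"(?=ATTTA)", seq.upper())]


# Hypothetical example string, not taken from any claimed sequence.
example = "GGCATTTATTTACCGAATAAAGGC"
regions = at_rich_regions(example)
motifs = attta_positions(example)
```

Applied to a real gene, the reported regions would only be candidates for modification; any change must still preserve the encoded amino acid sequence, which this sketch does not check.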

3. A method according to claim 2, further comprising removing self-complementary sequences and
replacing such sequences with non-self-complementary DNA comprising plant-preferred codons
while retaining a structural gene sequence encoding said protein.

4. A method according to claims 1 to 3, further comprising using plant-preferred sequences in the
removal of the polyadenylation signals and ATTTA sequences.

5. A method according to claims 1 to 3, in which the plant polyadenylation signals are selected from
the group consisting of AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT,
ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.
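
The signal list of claim 5, together with the clustering test of step b) of claim 1, can be illustrated as follows. This is a sketch under stated assumptions: the names are hypothetical, and "two or more signals within a ten-base sequence" is read here as two neighbouring hits whose combined span is at most ten bases, which is only one possible reading.

```python
# The sixteen hexamers listed in claim 5.
SIGNALS = frozenset([
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA",
    "ATAAAA", "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA",
    "ATTAAA", "AATTAA", "AATACA", "CATAAA",
])


def find_signals(seq):
    """Return (position, hexamer) for every listed hexamer found in seq (0-based)."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in SIGNALS:
            hits.append((i, hexamer))
    return hits


def clustered_signals(hits, window=10):
    """Return pairs of neighbouring hits that fit together inside `window` bases.

    Assumption: two hits are "within a ten-base sequence" when the span from the
    start of the first to the end of the second is at most `window` bases.
    """
    return [(a, b) for a, b in zip(hits, hits[1:]) if b[0] + 6 - a[0] <= window]


# Hypothetical example string, not taken from any claimed sequence.
example = "GGAATAAATTAAGGCCAACCAAGG"
hits = find_signals(example)
clusters = clustered_signals(hits)
```

A set lookup over a sliding six-base window keeps the scan linear in the sequence length and reports overlapping hits, which a plain substring count would miss.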

6. A method for improving the expression of a heterologous gene in plants, in which said gene comprises
a modified chimeric gene comprising a promoter which functions in plant cells operably linked
to a structural coding sequence and a 3' non-translated region containing a polyadenylation signal which
functions in plants to cause the addition of polyadenylate nucleotides to the 3' end of the RNA, in
which said structural coding sequence encodes an insecticidal protein at least a portion of which is derived
from a Bacillus thuringiensis protein, said method comprising modifying said structural coding sequence
so that said sequence has a DNA sequence which differs from the naturally occurring DNA sequence
encoding said Bacillus thuringiensis protein and said structural
coding sequence contains no more than 5 consecutive nucleotides consisting of either adenine or thymine residues.
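
The limit stated in claim 6 can be checked mechanically. A minimal sketch, with hypothetical function names, that measures the longest uninterrupted run of A or T bases in a candidate coding sequence:

```python
import re


def longest_at_run(seq):
    """Length of the longest uninterrupted run of A or T bases (0 if none)."""
    runs = re.findall(r"[AT]+", seq.upper())
    return max((len(run) for run in runs), default=0)


def within_at_limit(seq, limit=5):
    """True when no run of consecutive A/T bases is longer than `limit`."""
    return longest_at_run(seq) <= limit


# Hypothetical examples: ATTTA is a run of 5, AATAAA is a run of 6.
ok = within_at_limit("GGCATTTAGC")
too_long = within_at_limit("GCAATAAAGC")
```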

7. A method for improving the expression of a heterologous gene in plants, in which said gene comprises
a modified chimeric gene comprising a promoter which functions in plant cells operably linked
to a structural coding sequence and a 3' non-translated region containing a polyadenylation signal which
functions in plants to cause the addition of polyadenylate nucleotides to the 3' end of the RNA, in
which said structural coding sequence encodes an insecticidal protein at least a portion of which is derived
from a Bacillus thuringiensis protein, said method comprising modifying said structural coding sequence
so that said sequence has a DNA sequence which differs from the naturally occurring DNA sequence
encoding said Bacillus thuringiensis protein and has the following
characteristics:

said structural coding sequence has a region which is complementary to the following sequence:

GGCTTGATTC CTAGCGAACT CTTCGATTCT CTGGTTGATG AGCTGTTC
1          11         21         31         41

said region in said coding sequence having had 2 AACCAA sequences and 1 AATTAA sequence removed.
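
Because claim 7 defines the region through a sequence complementary to it, the coding-strand region is obtained as the reverse complement. The sketch below is illustrative only: the helper names are hypothetical, and the bases at positions 38 and 41 of the 48-base sequence are read as A, which is an assumption. It then checks that neither AACCAA nor AATTAA occurs in the resulting region.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# 48-base sequence of claim 7; positions 38 and 41 read as A (assumption).
oligo = "GGCTTGATTCCTAGCGAACTCTTCGATTCTCTGGTTGATGAGCTGTTC"

# Region of the coding strand that is complementary to the sequence above.
coding_region = reverse_complement(oligo)

# Hexamers that claim 7 states were removed from this region.
removed = ("AACCAA", "AATTAA")
still_present = [hexamer for hexamer in removed if hexamer in coding_region]
```

Under that reading, `still_present` is empty, which agrees with the statement that the two AACCAA sequences and the one AATTAA sequence have been removed from the region.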

8. A method according to claim 7, in which said structural coding sequence encodes an insecticidal protein
at least a portion of which is derived from Bacillus thuringiensis kurstaki HD-1.

9. A method according to claim 7 or 8, in which the plant is a tobacco plant.

10. A modified chimeric gene containing a promoter which functions in plant cells operably linked to
a structural coding sequence and a 3' non-translated region containing a polyadenylation signal which functions
in plants to cause the addition of polyadenylate nucleotides to the 3' end of the RNA, in which
said structural coding sequence encodes an insecticidal protein at least a portion of which is derived from a
Bacillus thuringiensis protein, in which said structural coding sequence has a DNA sequence
which differs from the naturally occurring DNA sequence encoding said Bacillus thuringiensis protein
and is selected from:

A. A structural gene which encodes an insecticidal protein of B.t.k. HD-1 having the sequence:



1 ATGGCTATAGAftACTGGTTACACCCCflATCGAIATTTCCr 40

41 TGTCGCTAACGCAATTT'CTrTTGAGTGAa.TTTGTTCCCGG 80 

81 TGCTGGATTTGTGTTAGGACTAGTTGATATTATCTGGGGA 120 

121 AXT^mGGTCCCTCTCAATGGGACGCAlTrCTrGTACAAA 160 

1S1 TTGAACLGCTCATCAACCAGAGAATCGAAGAGTTCGCTAG 200 

201 GAATCAAGGCATTTCaGATTAGAAGGACTAAGCAATCTT 240 

241 TATCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAG 280 

281 ATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCA 320 

321 ATTCAATGACATGAACAGTGCCCTTACAACCGCTATTCCT 360 

361 CTTTTTGGAGTTCAAAAT'rATCAAGTTCCTCTCCTCTCCG 400 

401 TGTACGTTCAAGCTGCCAACCTCCACCTCTCAGTTTTGAG 440 

441 AGATGTTTGAGTGTTTGGACAAAGGTGGGGATTTGATGCC 480 

481 GCGACTATCAATAGTCGTTATAATGATTTAACTAGGCTTA 520 

' 521 TTGGCAACTATACAGATCATGCTGTACGCTGGTACAATAC 560 

5 61 GGGATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGG 600 

601 ATCAGGTACAACCAGTTCAGAAGAGAGCTTACACTAACTG 640 

641 TATTAGATATCGTTTCrCTATTTC CGAAC? ATGATAGTAG 680 

681 AACGTATCCAATTCGAACAGTTXCCCAATTAACAAGAGA& 720 




721 A TTTATA CAAAC C CAGT A TTAG AAAATTTT GATG GT A GTT 7 60 

7 61 TTCGAGGCTCGGCTCAGGGCATAGAAJSGAACTATTAGGAG 80Q 

8 0 1 TCCACATT7GATGGATATACTTAATAGTATAACCAXCTAT 840 
841 ACGGArGCTCATAGAGGAGAArACTACTGGTCCGGTCACC 880 
831 AGATCATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATT 920 
321 CACTTTTCCGCTArATGGAACTAIGGGAAATGCAfiCTCCA S60 
961 CAAGAACGTATTGTTGCTCAACTAaSTCAGGGCGTGTArA 1000 

1001 GAACATTATCGTCCACCTTATATAGAAGAC C TTT' rA ACAT 1040 

1041 CGGGATCAACAACCAACAACTATCTb TT CTTGACGGGACA 1080 

1081 GAATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTG 1120 




1121 TATACAGAAAAAGC GGAA.C GGTAG ATTC GC IGGATGAAAT 1160 

11 SI ACCGCCACAGAArAACAACGTGCCACCTAGGCAAGGATTT 1200 

1201 AGTCAXCQAITAAGCCAIGTTOCaATGTTTCGrrCAGGCr 1240 

1241 T3aGTAATAGTAGTCTAAGTATAATAAGa.GCTCCTATGTT 12 S 0 

1281 CTCTTGGATACATCGTAGTGCTGAGTTCAACAACATCAIC 1320 

1321 CCTTCATCACAAATCACCCaAarcCGlCTCACCAAGTCTA 1360 

13 61 CTAATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGG 1400 

1401 ATTTACAGGAGGAGATAirCTTCGAAGAACTTCACCTGGC 1440 

1441 CAGATTTCAACCTTAAGAGTAAATATTACTGCACCATTAI 1480 

1481 CACAAAGATATCGGGTAAGAATTCGCTACGCTTCTACCAC 1520 

1521 AAACCTTCAGTTCCAC&CATCAATTGACGGAAGACCTATT 1560 

15 SI AATCAGGGGAATTTTTCAGCAACTATGAGTAGTGGGAGTA 1600 

1601 ATTTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTAC 1640 

1641 TCCGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTA 1S80 

1681 AGTGCTCATGTCTTCAArTCA.GGCAATGAAGTTTATATAG 1720 

1721 ATCGAATTGAATTTGTTCCGGCA 1743. 

B. A structural gene which encodes an insecticidal protein of B.t.k. HD-73 having the sequence:

1 ATGGCCJJTTGAAACCGGTTACACTCCCATCGACA2TCTCCT' 40

41 TGTCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGG 80 

81 TGCTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGT 120 

121 ATCTTTGCTCCATCTCAATGGGATGCATTCCTGGTGCAAA ISO 

1S1 TTGAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAG 200 

201 GAACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTC 240 

241 TACCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCG 230 

281 ATCCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCA 320 

321 ATrCAACGACATGAACAGCGCCTTGACCACAGCTATCCCA 360 

3 61 TTGTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCG 400 

401 TGTACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCG 440

441 AGACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCT 480
481 GCMCCATCAAraGCCGTia 

521 TTGGAAACTACAC CGACCACGCTGTTCGTTGGTACAACAC 

561 TTCCTTGGAGCGTGTCTGGGGTCCTGaCTCTaGaGa.TTGS 

SOI ATTAGATACAACCAGTTCAGGA£AGAAT^^ 

641 ^TTTGGACATTGTGTCTCTCrTCCCGAACT^GACTCCaG 

631 AACCTACCCTA^CCGTACAGTC^CCCAACTtrACCAGAGAA 

721 ATCTATACTA^CCCAGTTCTTGaGAMiTTCGACGSTRGCT 

761 TCCGTGGTTCTGCCCAAGGTATCGAaGGCTCCATCaGGAG 

801 CCC^CaCTTGaTGGACATCTTG^CaGCMSACT&TCTAC 

841 ACCGATGCTCACAGAGGAGAGTXTTACTGGTCTGGACACC 

881 AGATC^IGGCCTCTCCAGTTGGAITCAGCGGGCCCGAGTT 

921 TAG CTTTCCTCTCTATGGAACTaTGGGAaACGCCGCTCCA 

961 CAACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACA 

1001 GAACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATAT 1040

1041 CGGTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACA 1080

1081 G AGTTCGCCTATGGAAC CTCTTCTAACTTG C CATC CGCT G 1120 

1121 TTTACAGAAAGAGCGGAACCGTTGArTCCTTGGACGAAAT 1160 

1161 CCCACCAC^GJyiCA&CAArGTGCCACCC^GGC&AGGATTC 1200 

1201 TCCCACAGGTTGAGCCACGTGTCCATGrrCCGTTCCGGAT 124Q 

1241 TCAGCAACAGTTCCGTGAGCATCATCAGAGCTCC7ATGTT 1230 

1281 CTCTTGGAIACACCGTAGTGCTGAGTTCAACAACATCATC 1320 

1321 GCA.TCCGATAGTAOTACTCAAATCCCTGCAGTGAAGGGAA 1360 

13 SI ACTTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATT 1400 

1401 CACTGGTGGAGACGTCGTTAGACTCAACAGCAGTGGAAAT 1440 

1441 AACATTCAGAATAGAGGGTATATTGAAG7TCCAATTCACT 14 80 

1481 TCCCATCCACATCTACCAGA3ATAGAGTTCGTGTGAGGTA 1520 

1521 TGCTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGT 15 60. 

1361 AAT^CATCCATCTTCTCCAATACAGTTCCAGCTACAGC'TA 1S00 

1601 CCTCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTT 1640

1641 TGAAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATC 1680

1631 GTGGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGArrA 1720 

1721 TCGACAGATTCGAGTTCATTCCAGrrACTGCAACACTCGA 1760 

17 61 GGCTGAG 1767 . 

C. A structural gene encoding an insecticidal protein of B.t.k. HD-1 having the sequence:

1 ATGGACAACAACCCAAACa TCAACG&AJGQ^CTCCAT&CA 40 

41 ACTGCTTGAGTAACCCAGAAGTTGAAG^ 80 

81 ACGCATTGAAACCGGTTACACTCCCJLTCGACATCTCCTTG 120 

121 TCCTTGACACACTTTCTGCTCAGCGAGTTCGTGCCAGGTG ISO 

1S1 CTGGGTTCGTTCTCGGACTAGTTGaCA.TCATCTGGGCTAr 200 

. 201 CTTTGGTCCATCTCAATGGGATGCATTCCTGCTGCAAATT 240 

241 GAGCAGTTGATOiACCAGAGGATCGAAGAGTTCGCCaGGA 2S0 

2 81 ACCAGGCl^TCTCTAGGTTGGAAGGATTGAGCAATCrCTA 320 

321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400

401 TCM.C^CATGAaCAGCGCCTTSaCCaCaGCTATCCCaTT 440 

441 GTTCGCAGTCCAGAACTACCaaGTTCCTCTCTTGTCCGTG 430 

481 TACGTTCA&GCAGCTAATCTTCAC^^ 520 

521 ACGTTAGCGTGTTTGGGCaAAGGTGGGGATTCSATGCTGC 560 

SSI A&CCA.TCAAX&GC CGTTACAi^GACCTIACTAGGCTGATT 500 

601 GGAAACTACACCGACC^GCTGTTCGTTGGTaCSACACTG 640 

641 GCTTGGAGCGTGTCTGGG^TCCTGAra^ 680 

681 TAG ATACAACCAGT7CAGGAGAGAA7TGAC C CT CACAG7T 720 

721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 7 50 

7 61 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800 

8 0 1 CTATACT AACCCAGTTCTTGAGAACTTC GAC GGTAGCTTC 840 
841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880 
8 81 C^CACTTGATGGACATCTTGAACAGCATAACTATCTACAC S20 
921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960

961 ATClTGGCCTCTCC^GXxGGaTTCAGCGGGCCCaAGTTTii. 1000

1001 C CTTTCCTCTCTATGGAACTATGGGAAAC GC CGCTCCACA 1040 

1041 ACMCGTATCGTTGCTC^CTAGGTCAGGGTGTCTACAGA 1080 

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCITCAAZ&TCG 1120 

1121 GTATCAACAACCAGCAACH'TTCCGTTCTTGACGGAACAGA 1150 

llffl GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200 

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACSAAATCC 1240 

1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATrCTC 1280 

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320 

1321 AGCAACAGTTCCCTGAGCArCATCAGAGCTCCTATGTXCT 13 SO 

13 51 CATGGATTCATCGTAGTGCTGAGTTCAACAATATCATTCC 1400 

1401 rTCCTCTCAAATCACCCAAATCCCA.TTGACCAAGTCTACT 1440 

1441 AACCTTGGATCTG^AACTTCTGTCGTGAAAGGACCAGGCT 1480 

1481 rC^CAGGAGGTGArATTCTTAGAAGAACTTCTCCTGGCCA 1520 

1521 GATTAGCACCCTCAGAGTTAACATCACTGCACCACTTTCT 1560

1561 CAAAGATATCGTGTCAGGATTCGTTACGCATCTACCACTA 1600
IffOl ACTTGCAATTCCACACCTCCATCGACGGAAG^CSATCAA 1S40 
1641 TCAGGGTAACTTCTCCGCAACCAIGTC^^ 1680 

1681 TTGCAATCCGGCAGCTTCAGAACCGTGJGT^^ ' 1720 
1721 CrTTCAACTTCTCTA&CGG&rC^ 1760 
17 61 CGCTt^rGTGTTCAATTCTGGCAATGaAGTGtaCaOTGAC 1800 
1301 CcnJAXTGAGTrrGTG^CTGCCGA&G^^ 1840 
1341 AGTAC 1345. 

D. A structural gene encoding an insecticidal protein derived from B.t.k. HD-73 having the sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40

41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80

81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120

121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160

161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200
201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240

241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320

321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440
441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480

481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTCCTTCGAG 520
521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560

561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720
721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760
761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840

841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880

881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920

921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960

961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000

1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120

1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240
1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280
1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320
1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360

1361 CTTGGATACACCGTAGTGCTGAGTTCAACAACATCATCGC 1400
1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440

1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480

1481 CTGCTGG&GSCCTCfflTM^^ 1520

1521 aTX3kGMT2dS2JGGGI2X^^ 1560

1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640

1641 TTCATC(^CT?CTCCA&^^ 1680

1681 TCCTTGGATAATCTCCAATCCAGCGA^ 1720

1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760

1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800

1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840

1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTAATGCG 1880

1881 CTGTTTACGTCTACAAACCAGCTTGGACTCAAGACAAATG 1920

1921 G 1921.

E. A structural gene encoding the full-length insecticidal protein of B.t.k. HD-73, comprising the sequence:
1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40

41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80

81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120

121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160

161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200

201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240

241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320

321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400

401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440

441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480

481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520

521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560

561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680

681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720

721 TTGGAGATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760

761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800

801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840

841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880

881 CACACTTGATGGACATCTTGAACAGCAIA^ 920

921 CGATGCTCACAGAGGAGAGTATTACTGCTCTGGACACCAG 960

961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000

1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040

1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120

1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200
1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240

1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGACTCTC 1280

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320

1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360

1361 CTTGGATACACCGTACTGCTGAGTTCAACAACATCATCGC 1400

1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440

1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480

1481 CTGGTGGAGACCTCGTTAGACTCAACAGCACTGGAAATAA 1520

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560

1561 CCATCCACATCTACCAGATATAGAGTTCGTGTGAGGTATG 1600

1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640

1641 TTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 1680

1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720

1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760

1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800
1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840
1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880
1881 GCTGTTTACGTCTACAAACCAGCTCGGCCTCAAGACCAAP 1920
1921 GTGACGGATTATCATATTGATCAAGTGTCC^ 1960

1961 CCTiCCTGlGCGATG&GTTCTGTCXGGZlTGAaA&GCGSSA 2000 

2001 ATTGTCCGAGAAAGTCAA&CaTGCGA&GCGACTCAGTGSlT 204 0 

2041 GAACGC3ykT!raACrCCAAGATTCAAATOT CAAAGS.CATTS. 2080 

2081 AXAGGCAACCAGAAC5TGGGTGGGGCGGAAGTMIA.GGGAT 2120 

2121 TACCATCCAGGGAGGTGACGACGTGTTCAAGGAGAACrAC 2160 

2151 GTCACACTATCAGGTACCTTTGATG&OTGCTATCCAACAT 2200 

2201 ACCTCTACC^GAAGAICGACGAGTCCAAGTTGAAAGCCTT 2240 

2241 TACCCGTTATCAATTAAGAGGGTATAICGAAGATAGTCaA 2280 

2281 GACCTCGAGATCTACCTCATCCGCXACAATGCAAAACATG 2320 

2321 AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 23 60 

2361 TTCAGCCCAAAGTCCAATCGGAA&GTGTGGAGAGCCGAAT 2400 
2401 CGATGCGCGCCACACCTTGAATGGAA^CCTG^CTTAGATT 2440

2441 GTTCGTGTAGGSATGGAGAiUW^GTCKCCATC&TTCGCA 2430 

2481 TCATTTCTCCTTAGACAOTGATGTAGGMGTAQLG&CTTA 2520 

2521 AATGAGGACCTAGGTGTATGGGTGATCTTTAAGMTSAG21 2560 

2561 CGCaAG&TGGSCACGOLiUs^^ 2 600 

2 501 CGAAGAGAAACCATTAGTAGGAGAAGCGCTAGCTCGTGTG 2640 

2S41 AAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAGAAGT 2680 

2681 TGGAATG5GAGACCAAC^CG7CTACAAAGA£3<XAAAAGA 2720 

2721 ATCTGTAGATGCTTTATTTGTAAACTCrCAAXATGATCAA 2760 

2 7 S 1 TTACAAGCGGATACGAATATTGCCATGAiTCATGCGGCAG 2300 

2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 234 0 

2841 GC^GTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2880 

2881 GAATTAGAAGGGCGTATTTTCACTGCA!rTCTCCCTCTACG 2920 

2921 ATGCCAGAAACGTCATCAAGAACGGTGACTTCAACAATGG 2960 

2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3000 
3001 GAACAAA^CAACCAACGTTCGGTCC^TGTTGTTCCG^AAX 3040

3041 GGGAa.GCAGJUGTGTCACiLAGiLa.GTTCGTGTCTGTCCGGG 3080

3081 TCGTGGCTATATCCTTC GTGTCACAGCGTACAAGGAGGGA 3120 

3121 TAIGGAGAAGGTTGCGTAACCATTCATGAGAICGAGAACA 3160 

3151 AIACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200 

3201 AATCTATCCAAATAACACGGTAACGTCTAATGATTATACT 3240 

3241 GTAAArCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 3280 

3281 ATCGAGGArAZAACGAAGCrCCTTCCGTACCAGCTGATTA 3320 

3321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 3360 

33 SI AGAGAGAATCCTTGTGAATTTAACAGAGGGTATAGGGATT 3400 

3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAGAATTAGA 3440 

3441 AXACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3480 

3481 GAAACGGAAGGAACATTTAICGTGGACAGCGTGGAATrAC 3520 

3521 TCCTTATGGAGGAA 3534.

F. A structural gene encoding a full-length insecticidal protein of B.t.k. HD-73, comprising the sequence:
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80

81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120

121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160

161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200

201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240

241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320

321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440

441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520

521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560
561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600

601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640

641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680

681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720

721 TTGGACATTCTGTCTCTCTTCCCGAACTATGACTCCAGAA 760

761 CCTACCCTATCCCTACAGTGTCCCAACTTACCAGAGAAAT 800

801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840

841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880

881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920

921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960

961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000

1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040

1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120

1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240

1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320
1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360
1361 CTTGGATACACCGTAGTGCTGAGTTCAACAACATCATCGC 1400
1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440
1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480
1481 CTGGTGGAGACCTCGTTAGACTCAACAGCACTGGAAATAA 1520
1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560

1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640
1641 TTGATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 1680
1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720

1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760

1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800

1801 GACAGATTCGAGTtZCAITCCAGTTACTG^ 1840

1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880

1881 GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 1920

1921 GTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTA 1960

1961 CGTATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGA 2000

2001 ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 2040

2041 GAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTA 2080
2081 ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGAT 2120

2121 TAC CAT C CAAGG AGG GGAT G AC GTATTT AAAGAAAATT AC 2160 

21 SI GTCSCACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200 

2201 ATTTGTATCAAAAAATCGATGAATCAAAATTAAA&GCCTT 2240 

2241 TACCCOTATCAAT7AAGAGGGTATATCGAAGATAGTCAA 22 SO . 

2281 GACTTAGAAATCTATTTAA1TCGCTACAATGCAAAACATG 2320 

2321 AAA^GTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 2360 

2361 TTCAGCCCAAAGTCCAATC©3AAAGTGTGGAGAGCCGAAT 2400 

2401 CGA.TGCGCGCCACACCTTGAATGGAATCCTGACTTAGAOT 2440 

2441 GT7CGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCA 2480 

2481 TCATTTCTCCTTAG ACATTGATGTAG^ATGTACAGACTTA 2520 

2521 . AATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGA 2560 

25 61 CGCAAGAT GGG C ACGCAAGACTAGGGAAT CTAGAGTTTCT 2500 

2 601 CGAAGAGAAACCATTAGtTAGGAGAAGCGCTAGCTCGTGTG 2540 

2 641 AAAAG AGC GGAG AAAAAAT GGAG AG ACAAACGT GAAAAAT 2 530 

2 681 TGGAATGGG AAACAAATATCGTTTATAAAGAGGCAAAAGA 2720 

2721 ArCTGTAGATGCTTTATTTGTAAACTCTCAATATGATCAA 2760 

27 61 TTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAG • 2800 

2 801 ATAAACGTG7TCA7AGCATTCGAGAAGCTTATCTGCCTGA 2 S 4 0 
2841 GCTGTCTGTGATTCCGGGTGTCAATGCKCTTTTTTGAA 2880

2881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 2920

2921 ATGC l^GAiyiTGTC^TTAaAJLaTGGTGAritrTaATJiATGG 2960

2961 CTTATC CTGCTGGAACSTGaAAGGGCAICT AGIjirGTASAA 3000

3001 GAACAAAACAACCAACGTTCGGTCCTTGTrGrTrc 3040 

3041 G«3AAGCAG2UlGTGTCacaAGAAGTTCGTGTCTGTCCGGC 3080 

30 SI TCCTGGCTAT^CCTTCGTGTCaCAGCGTACaAGGAGGGA 3120 

3121 iaTGGAGAAGGTT5CGTA&CCATTC^G&G&2CG&G&ACA 3150 

3161 AJACAGACGAACTGAAGpTTTAGCAACIGCGTAGSAG&fiG^ 3200 

3201 AATCTATCCAAATAAGACGStTAACGTGl^^ 3240 

3241 GTSAATCAAGAAGAaTACGGAGGTGCGTaC&CTTCTCGTA 3280 

3281 ATCGAGGATATjyiCGAAGCTCCTTCCGTACCAGCTGATTA 3320 

3321 TGCGTCAGTCrTArGAMAAaAATCGTATACAG&TGGACGA 3360 

33 61 AGAGAGAATCCTTGTGAAITTAACAGAGGGTATAGGGA2T 3400 

3401 ACACGCCACTACCAGTTGGTTArGTGACAAAAGAATTAGA 3440 

3441 ATACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3480

3481 GAAACGGAAGGAACATTTATCGTGGACAGCGTGGAATTAC 3520
3521 TCCTTATGGAGGAA 3534.
G. A structural gene encoding a full-length insecticidal protein of B.t.k. HD-73, comprising the sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80
81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120
121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160
161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200 

201 CTTTGCTCCATCTCAATGGGATGCATTCCTGGTGGA&ATT 240 

241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280 . 

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320- 

321 CCAAATC7ATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360 

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400 

4 01 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440 

4 41 GTTCGCAGTCCAGAACTACCAAGTTCCrCTCTTGTCCGTG 480 

481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520 

521 ACGTTAGCGTGTI^GGGCAAAGGTGGGGATTCGATGCTGC 560 

561 AACCATCAAiAGC CGTTACAACGAC CTTACTAGGCTGA2T 60Q 

601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG' 640 

641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680

EP 0 385 962 B1

681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720

721 TTGGACATTCTGTCTCTCTTCCCGAACTATGACTCCAGAA 760

761 CCTACCCTATCCGTACAGTCTCCCAACTTACCAGAGAAAT 800

801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840

841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880

881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920

921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960

961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000

1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040

1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120

1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240

1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGACTC 1320

1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360

1361 CTTGGATACACCGTAGTGCTGAGTTCAACAACATCATCGC 1400

1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440
1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480
1481 CTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAATAA 1520
1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560

1561 CCATCCACATCTACCAGATATAGAGTTCGTGTGAGGTATG 1600
1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640

1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720

1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760

1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800

1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840

1841 CTGAGTACAACCTTGAGAGAGCCCAGAAGGCTGTGAACGC 1880

1881 CCTCTTTACCTCCACCAATCAGCTTGGCTTGAAAACTAAC 1920

1921 GTTACTGACTATCACATTGACCAAGTGTCCAACTTGGTCA 1960

1961 CCTACCTTAGCGATGAGTTCTGCCTCGACGAGAAGCGTGA 2000

2001 ACTCTCCGAGAAAGTTAAACACGCCAAGCGTCTCAGCGAC 2040

2041 GAGAGGAATCTCTTGCAAGACTCCAACTTCAAAGACATCA 2080

2081 ACAGGCAGCCAGAACGTGGTTGGGGTGGAAGCACCGGGAT 2120

2121 CACCATCCAAGGAGGCGACGATGTGTTCAAGGAGAACTAC 2160

2161 GTCACCCTCTCCGGAACTTTCGACGAGTGCTACCCTACCT 2200
2201 ACTTGTACCAGAAGATCGATGAGTCCAAACTCAAAGCCTT 2240

2241 CACCAGGTATCAACTTAGAGGCTACATCGAAGACAGCCAA 2280

2281 GACCTTGAAATCTACTCGATCAGGTACAATGCCAAGCACG 2320

2361 TTCTGCCCAATCTCCCATTGGGAAGTGTGGAGAGCCTAAC 2400

2401 AGATGCGCTCCACACCTTGAGTGGAATCCTGACTTGGACT 2440
2441 GCTCCTGCAGGGATGGCGAGAAGTGTGCCCACCATTCTCA 2480

2481 TCACTTCTCCTTGGACATCGATGTGGGATGTACTGACCTG 2520
2521 AATGAGGACCTCGGAGTCTGGGTCATCTTCAAGATCAAGA 2560

2561 CCCAAGACGGACACGCAAGACTTGGCAACCTTGAGTTTCT 2600
2601 CGAAGAGAAACCATTGGTCGGTGAAGCTCTCGCTCGTGTG 2640
2641 AAGAGAGCAGAGAAGAAGTGGAGGGACAAACGTGAGAAAC 2680
2681 TCGAATGGGAAACTAACATCGTTTACAAGGAGGCCAAAGA 2720
2721 GTCCGTGGATGCTTTGTTCGTGAACTCCCAATATGATCAG 2760

2761 TTGCAAGCCGACACCAACATCGCCATGATCCACGCCGCAG 2800
2801 ACAAACGTGTGCACAGCATTCGTGAGGCTTACTTGCCTGA 2840
2841 GTTGTCCGTGATCCCTGGTGTGAACGCTGCCATCTTCGAG 2880

2881 GAACTTGAGGGACGIATCTTtrACa^^ 2920
2921 ATGCCAGAAACGTCATCAAGAACGGTGACTTCAACAATGG 2960
2961 CCTCAGCTGCTGGAATGTGAAAGGTCATGTGGACGTGGAG 3000

2001 G3AC\G2lAC3L?.TC^GCGrTCCGTCCTSGTTGrGCCTGaGT 3040 

3041 GGGAAGCTGAACTGTCCi^GAGCTTAGACTCTGTCCAGG 3080 

3081 TAGAGGC3SCATTCTCCGTGTGACC GCTTACAAGG&GGGA 3120 

3121 TACGCTGAGGGTTGCGTGaCCajCCACGAGATCGAGaACa. 31ff0 

31 SI AC2kCCGACGAGCTT2iAGTrCTCCSACTGCGTCGAGGAAGA 3200 

32 0 1 AATCT3^CC^CAAC^C{mSC2TGCS^GACTACACT 3 2 € 0 

32 41 GTGAATCAGGAAGAGTACGGikGSTGCCTA<^^ 3280 
3281 ACAGAGGTTAC^CG2^GCTCCTTCCGTTCCTGCTGACEa. 3320 
3321 TGCCTCCGTGTaCGAGGAGAAATCCTACaCAGATG^SCaGA 3360 

33 SI C^GAGAACCCTTGCGAGTTa^CAGAGGTTACAGGGACT 3400 
3 401 AC^CACCACTTCCAGTTGGCTATGTTACCaAGGAGCTTGA 3440 



3441 GTACTTTCCTGAGACCGACAAAGTGTGGATCGAGATCGGT 3480
3481 GAAACCGAGGGAACCTTCATCGTGGACAGCGTGGAGCTTC 3520
3521 TCCTGATGGAGGAA 3534.



H. A structural gene that encodes an insecticidal protein of B.t.t., comprising the sequence:

1 ATGACTGCAGACAACAACACCGAAGCCCTCGACAGTTCTA 40

41 CC^CT^GG^GTTATCCAGAAGGGTATCTCCGTTGTGGG 80 

121 CTCGTGAGCTTCTASACA^CTTTCTC^CACCAirTTGGC 160 

161 GAAGCGAGGACCCTTGGAAAGCATTG^GGAGCAAGTTGA 200 

201 AGCTCTTATGGATCAGAAGMTGCAGAfrTATGCCAAGAAC 240 

241 AAGGCTTTGGCAGAACTCCAGGGCCTTCAGAACAATGTGG 280 

2 81 AGGACTACGTGAGTGCATTGTCCAGCTGGCAGAAGAACCC 320 
321 TCTTAGCTCCAGAAATCCTCACAGCCAAGGTAGGATCAGA 360 

3 61 GAGTTGTTCTCTCAAGCCGAATCCCACTTCAGAAATTCCA 400 



401 TOCCTAGCTTTGCTATCTCCGGTTACi 440
441 QkCTACCTATGCTCAAGCTGCCAACACCCACTTGTTTCTC 480

481 CTTAAGGACGCTCAAATCTATGGAGAAGAGTGGGGATACG 520

521 AGAAAGAGGACATTGCTGAGTTCTACAAGCGTCAACTTAA 560

561 GCTCACCCAAGAGTACACTGACCATTGCGTGAAATGGTAT 600

601 AACGTTGGTCTCGATAAGCTCAGAGGCTCTTCCTACGAGT 640

641 CTTGGGTGAACTTCAACAGATACAGGAGAGAGATGACCTT 680

681 GKTTG^GCTCGATCTX^CGCAC^ 720

721 GTGAGACTCTAC C QUlAGGAAGTGAAAACTGAGCrTACCA 760 

7 SI Gi<^CG^GCTCACXC^CCTJOTGTCGGaeTC&ACAaCCT 800 

801 TAGGGGT7i^GGA&Cl^CCT7C&£C^X22CGA&&&CT&C 840 

841 Ar?AG6AAACCACATCTCT7CGA£TAXCTTCACAGAATTC 330 

SSI AATTCCACaCAAGGTTrCAACi^GArACTaTGinaACGA 92 C 

921 CTCCTTCAACTATTGGTCCGCTAACTATGTTTCCACCAGA 950 

961 CCA&GCATTGGATCTAATGACRTCATCACATCTCCCTTCT 1000 

1001 ArGGTAACAAGTCCAGTGAACCTGTGCAGAACCTTGAGTT 1040 

1Q4L CAACGGCGAGAAAGTCTATAGAGCCGTCGCAAACACCAAT 1080 

1031 CTCSCTGTGTGGCCATCCGCAGTTTACTCAG^GTCACAA 1120 

1121 AffiSTGGAGTTTAGTCAGTAtEAACGArCAGACCGATGAGGC 1150 

1151 CAGCACCCAGACTTACGACTCCAAACGTAACGTTGGCGCA 1200 

1201 GTCTCTTGGGATTCTATCGACCAATTGCCTC CAGAAACCA 12 4 0 

1241 CAGACGAACCATTG<^GAAGGGCTACAGCCACCAAC?TAA 1280 

12 81 CTATGTGATGTGCTTCTTGATGCAAGGTTCCAGAGGGACC 1320 
1321 ATTCCAGTGTTGACCTSGACACACAAGTCCGTGGACTTCT 1350 

13 51 TCAACATGATCGAXAGCAAGAAGATCACTCAACTTCCCTT 1400 

14 01 GGTGAAAGCCT ACAAGCTGCAATCT GGTGCOTCC GTTGTC 1440 
1441 GCAGGTCCCAGATTCACTGGAGGTGACATCATCCAGTGCA 1480

1481 CAGAGAACGGCAGCGCAGCTACTATCTACGTGACACCTGA 1520
1521 TGTGTCTTACTCTCAGAAGTACAGGGCACGTATTCATTAC 1560

1561 GCATCTACCAGCCAGATCACCTTCACACTCAGCTTGGATG 1600

1601 GAGCACCCTTCAACX^GIATTAC^^ 1640

1641 CAAAGGTGACACTCTCACAXACAAlAGCTTCAJkCCT 1680

1681 ACTTTCAGO^CCATTTGAACICTC^ 1720

1761 CATCGACAAGATTGAGTTCATCCCAGTGAAC 1791.

I. A structural gene that encodes an insecticidal protein of B.t. entomocidus, comprising the sequence:

1 ATGGAGGAGAACAACCAAAAGCAATGCATTCCATACAACT 40
41 GCTTGAGTAACCCAGAAGAGGTATTGCTTGAltSffi^ 80
81 CAXTTCAACC®3TASCTCrTCC^TCGACATCTCCTTGTCC 120

121 TTGGTCCAGTTTCTGOTCAGC3ia.CTrCGTGCC3U^aTG(3TS ISO 

1S1 GGTTCCTTGTCGGACTaATTGACTTCGTCTGGGCTATCGT 200 

201 TGCTCCaTCTC^TGGGATGaiTTCCTGGTGCA&ATTGAG 240 

2 4 1 C^GTTGATCAaCGAGAGGATCGCTGAGTTCGCCAGGAaCG 280 

2S1 CTGCCA3CSCTAACTTGGAAGGATTGGGCAA.TA&CTTCAA. 320 

321 <^CrATGTGGAGGCCTTCAAAGAGTGGGAAGAGGACCCT 360 

3 SI AACAACCCAGAGACCCGCACTAGGGTGATCGACAGATTCA 400 • 
401 'GAATCTTGGACGGCCTCTTGGAGAGAGATATCCCArCCTT 440 

4 41 CAGAASCTCTGGCTTCGAAGTTCCTCTCTTGTCCGTGTAC 480 
481 GCTCAAGCAGCTAATCTTCACCTCGCTATCCTTCGAGACA 520
521 GTGTCATCTTTGGGGAAAGGTGGGGATTGACCACTATCAA 560

561 CGTCAATGAGAATTACAACAGACTTATCAGGCACATTGAC 600
601 GAGTACGCCGACCACTGTGCTAACACCTACAACCGTGGCT 640
641 TGAACAATCTCCCTAAGTCTACTTATCAAGATTGGATTAC 680
681 CTACAACAGGTTGAGGAGAGACTTGACCCTCACAGTTTTG 720
721 GACATTGCAGCTTTCTTCCCGAACTATGACAACAGGAGAT 760

761 ACCCTATCCAACCAGTGGGTCAACTTACCAGAGAAGTCTA 800
801 TACTGACCCACTTATCAACTTGAACCCTCAGTTGCAAAGT 840
841 GTCGCCCAACTTCCCACATTCAACGTCATGGAGTCCAGCC 880

881 GTATCAGGAACCCACACTTGTTTGACATCTTGAACAACCT 920
921 TACTATCTTCACCGATTGGTTCAGCGTTGGGCGTAACTTC 960

961 TATTGGGGTGGACACAGGGTCATCTCCTCTCTTATTGGAG 1000

1001 GTGGGAACATTACCTCTCCTATCTATGGACGTGAGGCAAA 1040
1041 CCAGGAGCCACCACGTAGTTTCACCTTCAACGGTCCAGTC 1080
1081 TTCAGAACCTTGTCTAACCCTACCTTGAGATTGCTCCAGC 1120
1121 AACCTTGGCCAGCTCCACCTTTCAACCTTAGAGGTGTTGA 1160

1161 GGGCGTTGAGTTCTCTACTCCTACCAACTCCTTCACTTAC 1200
1201 AGAGGTAGAGGAACCGTTGATTCCTTGACCGAACTCCCAC 1240
1241 CAGAGGACAATAGCGTGCCACCCAGGGAAGGCTACTCCCA 1280

1281 CAGGTTGTGCCACGCAACCTTCGTGCAGCGTTCCGGAACT 1320

1321 CC^rTCCTCACTACAG^GlTGTGTTCTCATGGACrGATC 1360

1361 GTAGTGCiaCTCTCACTAai^C^TGaTCCCSAGaGGM 1400

1401 CAATCAAATCCCATTGGTCAAGGGTTTCCCTGTGTGGGGA 1440

1441 GGA^CTTCTGTCATCACaGGaCC&GGCCTCACaGGAGGTG 1480

1481 ATATTCTTAGAAGA^CaCTTTTGGCGaCTTTGTGAGCCT 1520

1521 CCAAGTTAACATCAACTCTCCAACTACTCAAAGA3S.TCGT 1560

1561 CTCAGGTTTCGTTACGCATCTTCCCGTGACGCTAGAGTCA 1600

1601 TCGTGCTCACCGGAGCAGCTTCTACCGGTGTCGGTGGACA 1640

1641 ACTCTCCGTGAACATGCCACTCCAGAAGACTATGGAGATC 1680

1681 GGCGAGAACTTGACATCCAGGACCTTCAGATACACCGACT 1720

1761 TGGCATTAGCGAACAACCTCTCTTTGGAGCTGGTAGCATC 1800

1801 TCATCTGGCGLAATTGTACATTC^CAAGATTt^GAXCAXTC 1840
1841 TTGCCGACGCTACCTTCGAGGCTGAGTCTGACCTTGAGAG 1880
1881 AGCCCAGAAGGCTGTGAACGCCCTCTTTACCTCCTCTAAT 1920
1921 CAGATTGGCTTGAAAACTGACGTTACTGACTATCACATTG 1960
1961 ACCAAGTGTCCAACTTGGTCGACTGCCTTAGCGATGAGTT 2000
2001 CrGCCICGACGAGJL\G^GTG^CrCTCCGAG^^GrTA?-A 2040

2041 CACGCCAAGCGTCTCAGCGACGAGAGGAATCTCTrGCAAG 2030 

2081 ACCCCAACTTCAGAGGCATCAACAGGCAGCCAGACCGTGG 2120 

2121 TTGlSAGAGGAAGCACCGACAXC^ 21 SO 

21 £1 (^GTGT'K^GGAGAACTACGTCACCCTCCakGG^CrG 2200 

2201 TGGSCGaGTGCTACCCXI^CE&CTTOTACC&G&lUaoICGA 2240 

2241 TGAGTCCAAACTCAAAGCCTACACCAGGTATGAACTTAGA 228 0 

2281 GGCTACATCGAAGACAGCCAAGACCrTGAAATCTACCTCA 23 2 0 

2321 TCAGGTACAATGCCAAGCACGAGATCGTGJAATCTCCCAGG 2350 

23 SI TACTGCT^CCCTCtrGGCCACTTTCTGCCC^UUlTGCCCATT 2400 

24 01 GGGAAGTGTGGAGAGCCTAACAGATGCGCTCCACACCTTG 2440 
2441 ACTGGAATCCTGACTTGGACTGCTCCTGCAGGGATGGCGA 24S0 
2481 GAAGTCTGCCCACCATTCTCAtrCACTTCACC^GGACATC 2520 
2521 GATGTGGGAXGTACTGACCTGAATGAGGACCTCGGAGTCT 2560 

25 SI GGGTCATCTTCAAGATCAAGACCCAAGACGGACACGCAAG 2600 
2601 ACTTGGCAACCTTGAGTTTCTCGAAGAGAAACCA.riGCTC 2640 
2641 GGTGAAGCTCTCGCTCGTGTGAAGAGAGCAGAGAAGAAGT 2 680 
2681 GGAGGGACAAACGTGAGAAACTCCAACTCGAGACTAACAT 2720 
'2721 CGTTTACAAGGAGGCCAAAGAGTCCGTGGATGCTTTGTTC 27S0 
2761 GTGAACTCCCAATATGATAGGTTGCAAGTGGACACCAACA 2800

2801 T C GCCATGAT C CAC SGTGCAGACAAAC GTGTGCACAGGAT 2840 

2841 TCGTGAGGCrTACrTGCCTGAGTTCTCCGTCyiTCCCTGGT 2880 

2881 GTGAACGCTGCCATCTTCGAGGAACTTGAGGSACGTMCT 292Q 

2921 TTACCGCATACTCClTGTACG^GCt^GAAACGTCAXCAA 2960 

2S€1 GAACGGTGACTTCAACAATGGCCTCOTGTGCTGS^GTG 3000 

3001 AAAGGTCATGTGSACGTGGAGGAACAGAACAATCACCGTO 3040 

3041 CCGTCCTGGTTATCCCTGAGTGGGAAGCTGAAGTGTCCCA 3080 

3081 AGAGGTTAGAGTCTCTCCaGGTAGAGGCTACAITCTCCGT 3120 

3121 GTGACCGOT^ACAAGGAGGGATACGGTGAGGGTTGCGTGA 3 1 60 

3161 CCATCCACGAGATCGAGX1&CAACACCGACGAGCTXAAGTT 3200 

32Q1 CTCCAACTGCGTCGAGGAAGAAGTCTATCCCAACAACACC 3240 

3241 GTTACTTGC^CAACTACACTGGGACCCAGGAAGAGTACG 32 SO 

3281 AAGGTACC-ACACTAGCCGTAACCAAGGTTACGACGAAGC 3320 

3321 TaACGGAAACAATCCTTCCGTTCCTGCTGACTATGCCTCC 3350 



33 61 GTGTACGAGGAGAAATCCTACACAGATGGCAGACGTGAGA 3400 
3401 ACCCTTGCGAGTCCAACAGAGGTTACGGTGAOSCACACC 3440 

34 41 ACrrCCAGCAGGCTATGTTACCAAGGACCrrGAGTACTrr 3 480 
34 31 CCTGAGACCGACAAAGTGTGGArCGAGATCGGTGAAACCG 3520 
3521 AGGGAACCTTCATCGTGGACAGCGTGGAGCTTCTCTTGAT 3560

3561 GGAGGAA 3567.

J. A structural gene that encodes an insecticidal protein P2, comprising the sequence:
41 GCGACSZaTACAACGTCGTGGCTCACGATCCATTCAGCTT 80

81 CGAAC&CAAGAG C CTCGA.CACTATTCAGSAGGAisTGGa.TG 120 

121 GAATGGAAACGTACTGACCACTCTCTCTACGTC!XACCTG 160 

151 TGGTTGGAACAGTGTCCAGCTTCCTTCTCIAAGAAGGTCGG 200 

201 CTfCTCTCATCGGAAAACGTATCTTGTCCGAACTCTGGGGT 240 

'241 A.TCATCTTTCCATCTGGGTCCACTAATCTCATGCAAGACS. 230 

281 TCZTGAGGGAGACCGAAOiGTTTCTCAACCAGCGTCTCAA 320 

321 CACTGATACCTTGGCTAGAGTC^CGCTGAGTTGATCGGT 360 

3 SI CTCCSAGCAAACAIITCSTGAGTTCAJ^ 4 00 

4 01 ACTTCTTGAATCCAACTCAGAATCCTGTGCCTCTTTCCAT 440 
441 CACTTCTTCCGTGAACACTATGCAGCAACTCTTCCTCAAC 4 80 
481 &GA.TTGCCTCAGTTTCAGATTCAAGGCTAC CAGTTGCTCC 520 
521 TTCTTCCACTCTTTGCTCAGGCTGCCAACAXGCACTTGTC 5 SO 

5 SI CTTCATACGTGACGTGATCCTCAACGCTGACGAATGGGGA £00 
601 ATCTCTGCAGCCACTCTTAG^-C^TACAGAQACT3.CTTGA 640

641 GGAACTACACiCGTGMTACTCCAACTATTGCASCAACaC S80 

681 TTATCAG ACTGC CTTXCGTGGACT CAATACTAGGCTTCAC 720 

721 GACATGCTTGAGTTCAGGACCrACATGTTCCTTAACGTGT 760 "• 

761 TTGAGTACGTCAGCATTTGGAGTCTCTTCAAGTACCAGAG 800 

801 CTTGATGGTGTCCTCTGGAGCCAATCTCTACGCCTCTGGC 84 0 

' 841 AGTGGACCaCAGCAAACTCAGAGCTTCACAGCTCAGAACT 880 

381 GGCCAJTCTTCTATAGCTTGTTCCAAGtrCAACTCCAAC^A ' 920 

921 CACTCTCACTGGTATCTCTGGGACCAGACTCTCCATAACC 960 

361 TTTCCCAACAITGGTGGACTTCCAGGCTCCACEACAACCC 1000 

1001 ATAGCCTTAACTCTGCOiGAGTGAACTACAGTGGAGGTGT 1040 

1041 CAGCTCTGGATTGATTGGTGCAACX2UICTTGAACCACA&C 1080 

1081 TTCAATTGCTCCACCCTCTTGCCACCTCTGAGCACACCGT 112 0 

1121 TTGTGAGGTCCTGGCTTGACAGCGGTACTGATCGCGAAGG 11 SO 

1161 AGTTGCTACCTCTACASACTGGCAAACCGAGTCCTTCCAA 1200 

1201 ACCACTCTTAGCCTTCGGTGTGGAGCTTTCTCTGCACGTG 1240 

1241 GGAATTCAAACTAC2TTC CAGACTACTTCATTAGGAACAT 1280 

1281 CTCTGGTGTTCCTCTCGTCATCAGGAATGAAGACCTCACC 1320 

1321 CGTCCACTTCAnACAACCAGATTAGGAACATCGAGTCTC 1360 
1361 C?JTCCGGTACTCCAGGAGGTGCAAGAGCTTACCTCGTGTC 1400

14 Gl TGTCCATAACAGGAAGAACAACATCTACGCTGCCAACGAG 144 0 

1441 iUkTGGCACCATGATTCACCTTGCACC^GAAGATTAC^CTG 1480 

1481 GArTCaCCAJCTCTCOUkTCCATGCTACCCAAGTGAACaA 1520 

1521 TCAGACACGCACCTTCATCTCCSaAAAGTTCGGaAATCaA 1550 

1561 GGTGACTCCTTGAGGTTC^GC^TCCJ^AC*rACCGCTA 1500 

1601 GG7ACACTTIGAGAGGCAATGGAAACAGCTACAACCTTTA 1540 

1S41 CrrTGAGAGrrAGCTCCArTGGTAACTCCACCArCCGTGTT 15S0 

1S31 ACCATCAACGGACuTGTTTACAC^GTCTCTAArGTGAACA 1720 

1721 CTACAAC GAACAATGArGGCGTTAACGACAACGGAGCCAG 1750 

17 61 ATTCAGCGACATCAACATTGGCAACATCGTGGCCTGTGAC 1800 

1801 AACACTAACSTTACrrTGGACATCAATGTGACCOTCAATT 1840 

1 S 4 1 CTGGAACTCCArrTGATCTCATGAACATCATGTTTGTGCC 1880 
1881 AACTAACCTCCCTCCATTGTAC 1902.



K. A structural gene sequence encoding a fusion protein comprising the 610 N-terminal amino acids of B.t.k. HD-1 and the 567 C-terminal amino acids of B.t.k. HD-73, said gene comprising the sequence:
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80

81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120

121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160

161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200

201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240

241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320

321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400

401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440

441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGGTTCGAG 520
521 ACGTTAGCGTGTTTGGGOUlAGGTGGGGATTCGi^^ 560

561 AACCATCAAILAGCCGTTACA&CGACCTTACT^ 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720
721 TTGGACATTCTGTCTCTCTTCCCGAACTATGACTCCAGAA 760
761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840

841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880

881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920

921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960

961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000

1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040

1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120

1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240

1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGACTC 1320

1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360

1361 CATGGATTCATCGTAGTGCTGAGTTCAACAATATCATTCC 1400

1401 TTCCTCTCAAATCACCCAAATCCCATTGACCAAGTCTACT 1440

1441 AACCTTGGATCTGGAACTTCTGTCGTGAAAGGACCAGGCT 1480

1481 TCACAGGAGGTGATATTCTTAGAAGAACTTCTCCTGGCCA 1520
1521 GATTAGCACCCTCAGAGTTAACATCACTGCACCACTTTCT 1560

1561 CAAAGATATCGTGTCAGGATTCGTTACGCATCTACCACTA 1600

1601 ACTTGCAATTCCACACCTCCATCGACGGAAGGCCTATCAA 1640

1641 [sequence not legible in source] 1680

1681 TTGCAATCCGGCAGCTTCaGMCC!n , CGGlT2^CraCTC 1720
1721 CTTTC&aCTTCTCSaj^Gg&TCaAgCgTTTTC^CCyil^ 1760
1761 CGCTCATGTGTTCA&TTCTGfiaCAATGA&GTGrAanTlIAC 1800
1801 CGTATTGAGTTTCTGCCTGCCGAAGTTACCCTCGAGGCTG 1840
1841 AGTACAACCTTGAGAGAGCCCAGAAGGCTGTGAACGCCCT 1880
1881 CTTTACCTCCACCAATCAGCTT^ 1920

1921 ACTGACTATCACATTGACCAAGTGTCCAACTTGGTCACCT 1960
1961 ACCTTAGCGATGAGTTCTGCCTCGACGAGAAGCGTGAACT 2000
2001 CTCCGAGAAAGTTAAACACGCCAAGCGTCTCAGCGACGAG 2040
2041 AGGAATCTCTTGC&AGACTCCAACTTCSAaG^ 2080
2081 GGCAGCCAGAACGTGGTTGGGGTGGAAGCACCGGGATCAC 2120

2121 CATCCAAGGAGGCGACGATGTGTTCAAGGAGAACTACGTC 2160
2161 ACCCTCTCCGGAACTTTCGACGAGTGCTACCCTACCTACT 2200
2201 TGTACCAGAAGATCGATGAGTCCAAACTCAAAGCCTTCAC 2240
2241 CAGGTATCAACTTAGAGGCTACATCGAAGACAGCCAAGAC 2280
2281 C7TS^JL?.TCTACTCGLArC^ 2320

2321 CCCTGAATGTCCCAGGTACTGGTTCCCTCT^ 23 SO 

2361 TGCCC^CTCCCATTSGSAaGTGTGGAGAGCCiaAakGA 2400 

2401 TGCOTCCSC^CTTGaCTGGM.TCCTGACl^GGaCTGCT 244.0 

2441 CCTSCa(KGaT5SCG2Uaa(nGTSCCCaCCATTCTCATCa 2480 

2431 CTTtnrCCTXQSACSZCS^GTGGGATGTACTGaCCTGaA'T 2S2Q 

2521 G&GGACCTCGGAGTCXGGGTCXTCTTCAAGATC^ 2560 

25 SI AAGAGGGACACGCAAGACTTGGCAACCTTGAGTTTCTCGA 2 SO 0 

2641 iaGC^aGA^GaAGTGGAGG^CAAaCGTSaGAAACTCG 2S80 

2681 AATGGGAAaCTAACATCGTTTACAAG^G^CajUVGAGTC 27 2 Q 

2721 CGTGGJ^GCTTTGTTCGTGAACTCCCAATATGATCaGTTG 27 50 

27 51 C^GCCGACACCAACATCGCCATGATCCACGCCGCAGACA 2800 

28 01 AACCTGTGCACAGCArrCGTGAGGCTrACTTGCCTGAGTT 2840 
2841 GTCCGTGATCCCTGCTGTGAACGCTGCCATCTTCGAGGAA 2880 
2881 CTTGAGGGaCGTATCTTTACCGC^TTCTCCTTGTJU^GATG 2S2Q-. 
2S21 CCAGAAACGTCASCAAGAACGGTGACTTC^CAATGGCCT 2950 
2361 QkGCTGCTGGAA-GTGAAAGGTwA^GTGGACGTGGAGQAA 3000 
3001 CAGAACAATCAGCGTrCCGTCCTGGTTGTGCCTGAGTGGG 3040 
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3041 ^G€TGAAGTGTCCCAAGAGGTTAGAGTCTGTCaiGGl!aG 
3081 AKCTACA!TCCTCCGTCTI^^ 
3121 

3161 CCG2LCfflGCTTA&(7ETCTCCiaCTGCGTCG^SSM5aaM 

32Q1 CTOTCCAAOUIC&CCOTK^ 

3241 AA3a!£G2AG^£CQ»&6C^ 

3281 GAGG!TTACAACG2UIGCTCCT2^^ 

3321 CTCCGTGTMGAGC3AGa&ATCCTACACAGArGGC2^GaCGT 

33 SI GAGaACCCTTGCGAGTTCaACAGaGGTTAC&GGGACT^^ 

3401 CI^CACTICCAGTTGGCTA^^ 



3441 CTTTCCTGAGACCGAOtfUU^ 
3521 TGATGGAGGAA 3531. 



1 ATGGCTATAGAAACTGGTTACACCCCAATCGATATTTCCT 40

41 TGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGG 80

81 TGCTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGA 120
T C 

121 ATTTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAA 160 

161 TTGAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAG 200
C C C G C G 

201 GAACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTT 240
T 

241 TATCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAG 280

281 ATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCA 320
321 ATTCAATGACATGAACAGTGCCCTTACAACCGCTATTCCT 360

361 CTTTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAG 400

CC C C 

401 TATATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAG 440

G C C CC C CC C 

441 AGATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCC 480

481 GCGACTATCAATAGTCGTTATAATGATTTAACTAGGCTTA 520
521 TTGGCAACTATACAGATCATGCTGTACGCTGGTACAATAC 560

561 GGGATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGG 600

601 ATAAGATATAATCAATTTAGAAGAGAATTAACACTAACTG 640
CGCCGC GCT 

641 TATTAGATATCGTTTCTCTATTTCCGAACTATGATAGTAG 680

681 AACGTATCCAATTCGAACAGTTTCCCAATTAACAAGAGAA 720
FIGURE 2A 
721 ATTTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTT 760

761 TTCGAGGCTCGGCTCAGGGCATAGAAGGAAGTATTAGGAG 800

801 TCCACATTTGATGGATATACTTAATAGTATAACCATCTAT 840

841 ACGGATGCTCATAGAGGAGAATATTATTGGTCAGGGCATC 880
C C C T C 

881 AAATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATT 920 
G C 

921 CACTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCA 960 

961 CAACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATA 1000 

1001 GAACATTATCGTCCACCTTATATAGAAGACCTTTTAATAT 1040

. C 

1041 AGGGATAAATAATCAACAACTATCTGTTCTTGACGGGACA 1080
C C C C 

1081 GAATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTG 1120 

1121 TATACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAAT 1160 

1161 ACCGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTT 1200 

1201 AGTCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCT 124 0 

1241 TTAGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTT 1280 

1281 CTCTTGGATACATCGTAGTGCTGAATTTAATAATATAATT 1320 
G C C C C C 

1321 CCTTCATCACAAATTACACAAATACCTTTAACAAAATCTA 13 60 
C C C AC C C G 

1361 CTAATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGG 14 00 
FIGURE 2B 
1401 ATTTACAGGAGGAGATATTCTTCGAAGAACTTCACCTGGC 1440

1441 CAGATTTCAACCTTAAGAGTAAATATTACTGCACCATTAT 1480

1481 CACAAAGATATCGGGTAAGAATTCGCTACGCTTCTACCAC 1520 

1521 AAATTTACAATTCCATACATCAATTGACGGAAGACCTATT 15 60 
CC T G C 

1561 AATCAGGGGAATTTTTCAGCAACTATGAGTAGTGGGAGTA 1600
1601 ATTTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTAC 1640
1641 TCCGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTA 1680
1681 AGTGCTCATGTCTTCAATTCAGGCAATGAAGTTTATATAG 1720 
1721 ATCGAATTGAATTTGTTCCGGCA 1743 

FIGURE 2C 
1 ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 40
CCA C AC 

41 ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 80
C C G A T C T 

81 AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 120
CCT C TC CC 

121 TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 160
CTGAG GCCCGCGA 

161 CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 200
GCTCC CCC T 

201 TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 240
C A T C G G

241 GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 280
G GC GGC G C 

281 ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 320 
G C G G T G C 

321 TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 360
C C T GAGC C C 

361 CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 400 
C TC CC C G A 

401 TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 4 40 
C C T G C A C AT 

441 TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 4 80 
GC CGCC C G C G 

481 TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG 520 
C AT C T CC CAGC GC TC 

521 ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 560
C AGC G C T 

561 GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 600 
AC C CCCCT G 

601 GGCAACTATACAGATCATGCTGTACGCTGGTACAATACGG 640
A CCCCC TT . C T 

641 GATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGGAT 680 
C G C T T 



FIGURE 3A
681 AAGATATAATCAATTTAGAAGAGAATTAACACTAACTGTA



721 TTAGATATCGTTTCTCTATTTCCGAACTATGATAGTAGAA 

G C T G C C CTCC 

7 61 CGTATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 

CCTCT G CTC 

801 TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 
C T TCTGCCC CC 

8 41 CGAGGCTCGGCTCAGGGCATAGAAGGAAGTATTAGGAGTC 

T T T C A T C CTCC C C 

881 CACATTTGATGGATATACTTAATAGTATAACCATCTATAC 
C CCTGCC T C 

921 GGATGCTCATAGAGGAGAATATTATTGGTCAGGGCATCAA
C C GC TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA 
C C ATA CAGC C G T 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA
CTC C C 

1041 ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA
C T C C 

1081 ACATTATCGTCCACCTTATATAGAAGACCTTTTAATATAG 
C G T G C C C C 

1121 GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGA



C C C G 



T C 



1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA
G C C T T C 



1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC
G C T CT C C 



1241 CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG
A C T C CTC 



12 81 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 
CCAGG CGC C CAC 



13 21 AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 
C C TCC G C C C 



13 61 CTTGGATACATCGTAGTGCTGAATTTAATAATATAATTCC 
AT G C C C 

FIGURE 3B 
1401 TTCATCACAAATTACACAAATACCTTTAACAAAATCTACT 1440
CT CC CAGCG 

1441 AATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGGAT 1480 
C A G C 

1481 TTACAGGAGGAGATATTCTTCGAAGAACTTCACCTGGCCA 1520 
C T A T 

1521 GATTTCAACCTTAAGAGTAAATATTACTGCACCATTATCA 15 60 
AGC CC TCC CTT 

1561 CAAAGATATCGGGTAAGAATTCGCTACGCTTCTACCACAA 1600 
T C G T A A 

1601 ATTTACAATTCCATACATCAATTGACGGAAGACCTATTAA 1640 
C G ' CCCC G C 

1641 TCAGGGGAATTTTTCAGCAACTATGAGTAGTGGGAGTAAT 1680 
T C C C C TCA CCCC 

1681 TTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTACTC 1720
G A C CACC C 

1721 CGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTAAG 17 60 
T C CTC CTCCCT 

17 61 TGCTCATGTCTTCAATTCAGGCAATGAAGTTTATATAGAT 1800 
C G T G C T C 

1801 CGAATTGAATTTGTTCCGGCAGAAGTAACCTTTGAGGCAG 1840 
T G GTC T C T 

1841 AATAT 1845 
G C 



FIGURE 3C
1 ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 40
CCA C AC 

41 ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 8 0 
C C G AT C T 

81 AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 120 
CCT C TC CC 

121 TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 160 
CTGAG GCCCGCGA 

161 CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 200 
GCTCC CCC T 

201 TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 240 
C A T C G G 

241 GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 280
G GC GGC G C 

281 ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 320 
G C G G T G C 

321 TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 3 60 
C C T GAGC C C 

361 CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 400
C TC CC C G A 

401 TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 440 
C CTGCA CAT 

441 TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 4 80 
GC CGCC CGCG 

481 TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG 520 
C AT C T CC CAGC GC TC

521 ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 560 
C AGC G C T 

561 GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 600 
A C C C C CC T G 

601 GGCAACTATACAGATTATGCTGTACGCTGGTACAATACGG 64 0 
A CCCCC TT CT 

641 GATTAGAACGTGTATGGGGACCGGATTCTAGAGATTGGGT 680 
C G G C T T A 



FIGURE 4A 






EP 0 385 962 B1 



681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 720

TACCGCG GCCAT 

721 TTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAGAA 7 60 
G C T GT C C CTCC

7 61 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 800 
CCCTCT G CTC 

801 TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 84 0 
C T TCTGCCC CC 

841 CGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 880
TTTCATC G CTCC C C 

881 CACATTTGATGGATATACTTAACAGTATAACCATCTATAC 920 
C C CT G C T C 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 960
C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA 1000 
C C ATA CAGC C G T 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA 104 0 
CTC C C 

1041 ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA 1080 
C T C C 

1081 ACATTATCGTCCACTTTATATAGAAGACCTTTTAATATAG 1120 
CGT CGC CC C 

1121 GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGA 1160 
TCCCG TC' A 

1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA 1200 
G C C T T C T 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 124 0 
G C T CT C C 

1241 CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG 1280 
A C T C CTC 

1281 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 1320
CCAGG CGC C CAC 

1321 AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 13 60 
C C TCC G C C C 

1361 CTTGGATACATCGTAGTGCTGAATTTAATAATATAATTGC 1400 
C G C C C C C 
1401 ATCGGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440
C 

1441 TTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTA 1480 

c c c c c 

14 81 CTGGTGGGGACTTAGTTAGATTAAATAGTAGTGGAAATAA 1520 

A C C C C C C 

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 15 60 

15 61 CCATCGACATCTACCAGATATCGAGTTCGTGTACGGTATG 1600 

C A GA 

1601 CTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGTAA 164 0 



1641 TTCATCCATTTTTTCCAATACAGTACCAGCTACAGCTACG 1680 
C C T C 

1681 TCATTAGATAATCTACAATCAAGTGATTTTGGTTATTTTG 17 20 
C G C C C C C 

1721 AAAGTGCCAATGCTTTTACATCTTCATTAGGTAATATAGT 1760

c c c c 

1761 AGGTGTTAGAAATTTTAGTGGGACTGCAGGAGTGATAATA 1800
G C T C 

1801 GACAGATTTGAATTTATTCCAGTTACTGCAACACTCGAGG 1840
C G C 

1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880 

A TGCG 

1881 GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 1920 
CTGT AGGTCTACA C AGCT G ACTC G CA TG 

1921 G 1921 
1 GAAAGAATAGAAACTGGTTACACCCCAATCGATATTTCCT 40
ATGGCC T C T C C C 

41 TGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGG 80 
CT GAG GC C C G C G A 

81 TGCTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGA 120 
GCTCC CCC T 

121 ATTTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAA 160 
C A T C G G 

161 TTGAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAG 2 00 
G G-C GGC G C 

201 GAACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTT 2 40 
G C G G T G C 

241 TATCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAG 2 80 
C C T GAGC C C 

2 81 ATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCA 3 20 

C TC CC C G A 

321 ATTCAATGACATGAACAGTGCCCTTACAACCGCTATTCCT 3 60 
C CTGCA CA 

3 61 CTTTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAG 4 00 

TGC CGCC CGC 

401 TATATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAG 4 40 
G C AT C T CC CAGC GC TC 

4 41 AGATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCC 4 80 

■ C AGC G C T 

481 GCGACTATCAATAGTCGTTATAATGATTTAACTAGGCTTA 520
AC C CCCCT G 

521 TTGGCAACTATACAGATTATGCTGTACGCTGGTACAATAC 560
A C C CC C TT C 

561 GGGATTAGAACGTGTATGGGGACCGGATTCTAGAGATTGG 600

T C G G C T T 

601 GTAAGGTATAATCAATTTAGAAGAGAATTAACACTAACTG 64 0 
ATACCGCG GCCA 

641 TATTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAG 6 80 
T G C T GT C C CTCC 
681 AAGATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAA 720
CC C T C T G C T C 

721 ATTTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTT 7 60 
C T TCTGCCC C 

7 61 TTCGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAG 800 
CTTTCATC G CTCC C 

801 TCCACATTTGATGGATATACTTAACAGTATAACCATCTAT 84 0 
C C CCTG C T C 

841 ACGGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATC 880 
C CAAGG C TAC 

881 AAATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATT 920 
G C C ATA CAGC C G 

921 CACTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCA 960 
T C T C C C 

961 CAACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATA 1000 
C T C C 

1001 GAACATTATCGTCCACTTTATATAGAAGACCTTTTAATAT 1040 
CGT CGC CC 

1041 AGGGATAAATAATCAACAACTATCTGTTCTTGACGGGACA 1080 
CTCCCG TC A 

1081 GAATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTG 1120
G C C T T C 

1121 TATACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAAT 1160
T G C T CT C 

1161 ACCGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTT 12 00 
C A C . T C C 

1201 AGTCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCT 12 40 
TCC CA G G- CGC C C A 

1241 TTAGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTT 1280 
C C C TCC G C C C 

1281 CTCTTGGATACATCGTAGTGCTGAATTTAATAATATAATT 1320 
C G C C C C C 

1321 GCATCGGATAGTATTACTCAAATCCCTGCAGTGAAGGGAA 13 60 
C 

1361 ACTTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATT 14 00 
C C C C 
1401 TACTGGTGGGGACTTAGTTAGATTAAATAGTAGTGGAAAT 1440
C ACC CCCC 

1441 AACATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACT 1480 



1481 TCCCATCGACATCTACCAGATATCGAGTTCGTGTACGGTA 1520
C A GA 

1521 TGCTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGT 1560 
G T 

15 61 AATTCATCCATTTTTTCCAATACAGTACCAGCTACAGCTA 1600 
C C T 

1601 CGTCATTAGATAATCTACAATCAAGTGATTTTGGTTATTT 1640
CCG C CC C C 

1641 TGAAAGTGCCAATGCTTTTACATCTTCATTAGGTAATATA 1680 

C C C C 

1681 GTAGGTGTTAGAAATTTTAGTGGGACTGCAGGAGTGATAA 17 20 
G C T 

1721 TAGACAGATTTGAATTTATTCCAGTTACTGCAACACTCGA 17 60 
C C G C 

1761 GGCTGAA 17 67 
G 
1 ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 40
CCA C AC 

41 ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 80
C C G A T C T ' 

81 AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 120 
CCT C TC CC 

121 TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 160 
CTGAG GCCCGGGA 

161 CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 200 
G C TC C CCC T 

201 TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 2 40 
C A T C G G 

241 GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 280 
G GC GGC.G C 

281 ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 320 
G C G G T G C 

321 TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 3 60 
C C T GAGC C C 

361 CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 400
C TC CC C G A 

401 TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 4 40 
C C T G C A C AT 

441 TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 4 80 
GC CGCC CGCG 

481 TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG 520 
C A T C T CC CAGC GC TC 

521 ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 560
C AGC G C T 

561 GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 600
A C C CCCCT G 

601 GGCAACTATACAGATTATGCTGTACGCTGGTACAATACGG 640
A C C CC C T T CT 

641 GATTAGAACGTGTATGGGGACCGGATTCTAGAGATTGGGT 680
C G G C T T A 
681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 720

TACCGCG GCCAT 

721 TTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAGAA 7 60 
G C T GT C C CTCC 

761 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 800

CCCTCT G CTC 

801 TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 840 
C T TCTGCCC CC 

841 CGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 880
TTTCATC G CTCC C C 

881 CACATTTGATGGATATACTTAACAGTATAACCATCTATAC 920 
C C CT G C T ' C 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 9 60 
C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA 100 0 
C- C ATA CAGC C G T 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA 104 0 
CTC C C 

1041 ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA 108 0 
C T C C 

1081 ACATTATCGTCCACTTTATATAGAAGACCTTTTAATATAG 112 0 
CGT CGC CC C 

1121 GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGA 1160 
TCCCG TC A 

1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA 1200
G C C T T C T 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 12 4 0 
G C T CT C C 

1241 CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG 1280
A C T C CTC 

1281 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 132 0 
CCAGG CGC C CAC 

1321 AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 1360 
C C TCC G C C C 

13 61 CTTGGATACATCGTAGTGCTGAATTTAATAATATAATTGC 14 00 
C G C C C C C 
1441 TTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTA 1480

c c c c c 

1481 CTGGTGGGGACTTAGTTAGATTAAATAGTAGTGGAAATAA 1520
A C C C C C C 

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 15 60 

1561 CCATCGACATCTACCAGATATCGAGTTCGTGTACGGTATG 1600
C A GA 

1601 CTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGTAA 1640
G T 

1641 TTCATCCATTTTTTCCAATACAGTACCAGCTACAGCTACG 1680 
C C T C 

1681 TCATTAGATAATCTACAATCAAGTGATTTTGGTTATTTTG 1720 
C G C C C C C 

1721 AAAGTGCCAATGCTTTTACATCTTCATTAGGTAATATAGT 17 60 
C C C C 

17 61 AGGTGTTAGAAATTTTAGTGGGACTGCAGGAGTGATAATA 1800 

G C T C 

1801 GACAGATTTGAATTTATTCCAGTTACTGCAACACTCGAGG 1840
C G C 

1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880

1881 GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 1920
1921 GTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTA 1960 
1961 CGTATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGA 2000 
2001 ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 204 0 
2041 GAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTA 2080 
2081 ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGAT 2120 
2121 TACCATCCAAGGAGGGGATGACGTATTTAAAGAAAATTAC 2160

2161 GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200

2201 ATTTGTATCAAAAAATCGATGAATCAAAATTAAAAGCCTT 2240 

22 41 TACCCGTTATCAATTAAGAGGGTATATCGAAGATAGTCAA 2 28 0 

2281 GACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATG 2320
2321 AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 2360

23 61 TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 2 400 
2401 CGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATT 2 4 40 

24 41 GTTC GTGTAGGGATGG AG AAAAG TGTGCC C ATC ATTCGC A 2 4 80 

24 81 TCATTTCTCCTTAGACATTGATGTAGGATGTACAGACTTA 2 52 0 
2521 AATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGA 25 60 

25 61 CGCAAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCT 2 600 
2601 CGAAGAGAAACCATTAGTAGGAGAAGCGCTAGCTCGTGTG 2640
2641 AAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAAAAAT 2 68 0 
2681 TGGAATGGGAAACAAATATCGTTTATAAAGAGGCAAAAGA 27 2 0 
2721 ATCTGTAGATGCTTTATTTGTAAACTCTCAATATGATCAA 2760
2761 TTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAG 2800 
2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 28 40 
2841 GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2880

2881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 2 920 

2921 ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 2 960 

2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3 000 

3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 3040 

3041 GGGAAGCAGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG 3080

3081 TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 3120

3121 TATGGAGAAGGTTGCGTAACCATTCATGAGATCGAGAACA 3160 

3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200 

3201 AATCTATCCAAATAACACGGTAACGTGTAATGATTATACT 3240

3241 GTAAATCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 3280 

3281 ATCGAGGATATAACGAAGCTCCTTCCGTACCAGCTGATTA 3320 

3321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 33 60 

3361 AGAGAGAATCCTTGTGAATTTAAC AGAGGGTATAGGGATT 3 4 00 

3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAGAATTAGA 344 0 

3441 ATACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3480

3481 GAAACGGAAGGAACATTTATCGTGGACAGCGTGGAATTAC 3 52 0 

3521 TCCTTATGGAGGAA 3534 
1 ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 40
CCA C AC 

41 ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 8 0 
C C G A T C T 

81 AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 120 
CCT C TC CC 

121 TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 160 
CT GAG GCCCGCGA 

161 CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 200 
GCTCC CCC T 

201 TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 2 40 
C A T ' C G G 

241 GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 280
G GC GGC G C 

281 ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 320
G C G G T G C 

321 TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 3 60 
C C T GAGC C ■ C 

3 61 CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 4 00 

C TC CC C G A 

401 TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 4 40 
C CTGCA CAT 

441 TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 480
GC CGCC CGCG 

4 81 TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG 520 

C A T C T CC CAGC GC TC 

521 ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 5 60 
C AGC G C T 

561 GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 600
AC C CCCCT G 

601 GGCAACTATACAGATTATGCTGTACGCTGGTACAATACGG 64 0 
A C C CC C TT CT 
641 GATTAGAACGTGTATGGGGACCGGATTCTAGAGATTGGGT 680
C G G C T T A 

681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 720 
TACCGCG GCCAT 

721 TTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAGAA 760
G C T GT C C CTCC

761 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 800
CCCTCT G CTC 

801 TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 840
C T TCTGCCC CC

841 CGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 880
TTTCATC G CTCC C C 

881 CACATTTGATGGATATACTTAACAGTATAACCATCTATAC 920
C C CT G C T C 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 960
C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA 1000
C C ATA CAGC C G T 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA 1040 
CTC C C 

1041 ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA 1080
C T C C 

1081 ACATTATCGTCCACTTTATATAGAAGACCTTTTAATATAG 1120 
CGT CGC CC C 

1121 GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGA 1160 
TCCCG TC A 

1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA 12 00 
G C C T T C T 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 1240
G C T CT C C 

1241 CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG 1280
A C T C CTC 

1281 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 132 0 
CCAGG CGC C CAC 

1321 AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 13 60 
C C TCC G C C C 
1361 CTTGGATACATCGTAGTGCTGAATTTAATAATATAATTGC 1400

C G C C C C C 

14 01 ATCGGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 144 0 
C 

14 41 TTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTA 14 80 

c c c c c 

14 81 CTGGTGGGGACTTAGTTAGATTAAATAGTAGTGGAAATAA 1520 
A C C C C C C 

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560



15 61 CCATCGACATCTACCAGATATCGAGTTCGTGTACGGTATG 1600 
C A GA 

1601 CTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGTAA 1640 
G T 

1641 TTCATCCATTTTTTCCAATACAGTACCAGCTACAGCTACG 1680
C C T C 

1681 TCATTAGATAATCTACAATCAAGTGATTTTGGTTATTTTG 1720 
C G C C C C C 

1721 AAAGTGCCAATGCTTTTACATCTTCATTAGGTAATATAGT 1760
C C C C 

17 61 AGGTGTTAGAAATTTTAGTGGGACTGCAGGAGTGATAATA 180 0 

G C T C 

1801 GACAGATTTGAATTTATTCCAGTTACTGCAACACTCGAGG 1840
C G C 

18 41 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 18 8 0 



1881 GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 192 0 
G C C C G C 

1921 GTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTA 1960 
G C G G 

1961 CGTATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGA 2000 
C CC CAGC G C 

2001 ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 2040 



2041 GAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTA 2 08 0 
2081 ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGAT 2120

2121 TACCATCCAAGGAGGGGATGACGTATTTAAAGAAAATTAC 2160
G TC GCGGC 

2161 GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200

2201 ATTTGTATCAAAAAATCGATGAATCAAAATTAAAAGCCTT 2240
CCCCGG CGCGG 

2241 TACCCGTTATCAATTAAGAGGGTATATCGAAGATAGTCAA 2280

22 81 GACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATG 232 0 

C C G CC C C 

2321 AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 23 60 

23 61 TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 2 400 
2 4 01 CGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATT 24 4 0 

24 41 GTTCGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCA 2 4 8 0 

24 81 TCATTTCTCCTTAGACATTGATGTAGGATGTACAGACTTA 252 0 
2521 AATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGA 2560

25 61 CGCAAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCT 2600 

2601 CGAAGAGAAACCATTAGTAGGAGAAGCGCTAGCTCGTGTG 2640

2641 AAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAAAAAT 2680

G G 

2 681 TGGAATGGGAAACAAATATCGTTTATAAAGAGGCAAAAGA 27 2 0 
G C C C C 

2 721 ATCTGTAGATGCTTTATTTGTAAACTCTCAATATGATCAA 2 7 60 

27 61 TTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAG 2800 
2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 2840

2841 GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2 880 

2881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 2 920 

C C 

2 921 ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 2 960 
C C. CGC CCC 

2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3000 

3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 3040 

3041 GGGAAGCAGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG 3080 

3081 TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 3120

3121 TATGGAGAAGGTTGCGTAACCATTCATGAGATCGAGAACA 3160 

3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200 

3201 AATCTATCCAAATAACACGGTAACGTGTAATGATTATACT 3 24 0 

3241 GTAAATCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 32 80 

3281 ATCGAGGATATAACGAAGCTCCTTCCGTACCAGCTGATTA 3320

3321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 3360

3361 AGAGAGAATCCTTGTGAATTTAACAGAGGGTATAGGGATT 3 4 00 

3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAGAATTAGA 3440 

34 41 ATACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3 480 

3481 GAAACGGAAGGAACATTTATCGTGGACAGCGTGGAATTAC 3 52 0 

3521 TCCTTATGGAGGAA 35 3 4 
1 ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 40
CCA C AC 

41 ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 80
C C G A T C T 

81 AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 120 
CCT C TC CC 

121 TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 160 
CT GAG GC C C G C G A 

161 CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 200
GCTCC CCC T 

201 TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 240 
C A T C G G 

241 GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 280
G GC GGC G C 

281 ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 320 
G C G G T G C 

321 TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 360
C C T GAGC ' C C 

3 61 CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 4 00 
C TC CC C G ' A 

401 TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 4 40 
C C T G C A C AT 

441 TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 4 80 
GC CGCC CGCG 

481 TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG 520 
C A T C T CC CAGC GC TC 

521 ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 560
C ' AGC G C T 

561 GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 600 
AC C C C CC T G 

601 GGCAACTATACAGATTATGCTGTACGCTGGTACAATACGG 64 0 
A CCCCC TT CT 

641 GATTAGAACGTGTATGGGGACCGGATTCTAGAGATTGGGT 680 
C G G C T T A 
681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 720
TACCGCG GCCAT

721 TTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAGAA 7 60 
G C T GT C C CTCC 

7 61 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 800 
CCCTCT G CTC 

801 TTATAC AAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 8 40 
C T TCTGCCC CC 

841 CGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 8 80 
TTTCATC G CTCC C C 

881 CACATTTGATGGATATACTTAACAGTATAACCATCTATAC 920 
C C CT G C T C 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 9 60 
_C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA 1000
C C ATA CAGC C G T 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA 1040 
CTC C C 

10 41 ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA 1080 
C T C C 

10 81 ACATTATCGTCCACTTTATATAGAAGACCTTTTAATATAG 1120 
CGT CGC CC C 

1121 GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGA 1160 
TCCCG TC A 

1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA 12 00 
G C C T T C T 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 1240 
G C T CT C C 

1241 CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG 1280 
A C T C CTC 

1281 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 1320 
CCAGG CGC C CAC 

1321 AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 1360
C C TCC G C C C 

13 61 CTTGGATACATCGTAGTGCTGAATTTAATAATATAATTGC 14 00 
C G C C- C C C 
1401 ATCGGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440



1441 TTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTA 1480
C C C C C 

14 81 CTGGTGGGGACTTAGTTAGATTAAATAGTAGTGGAAATAA 1520 
A C C C C C C 

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560 



15 61 CCATCGACATCTACCAGATATCGAGTTCGTGTACGGTATG 1600 
C A GA 

1601 CTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGTAA 1640
G T 

1641 TTCATCCATTTTTTCCAATACAGTACCAGCTACAGCTACG 1680
C C T C 

1681 TCATTAGATAATCTACAATCAAGTGATTTTGGTTATTTTG 1720
C G C C C C C 

1721 AAAGTGCCAATGCTTTTACATCTTCATTAGGTAATATAGT 17 60 
C C C C 

1761 AGGTGTTAGAAATTTTAGTGGGACTGCAGGAGTGATAATA 1800
G C T C 

1801 GACAGATTTGAATTTATTCCAGTTACTGCAACACTCGAGG 1840
C G C 

1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880
GCCTG C T C 

1881 GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 1920
CC CCCTGTCTG TC 

1921 GTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTA 1960 
TTC C C CGC 

1961 CGTATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGA 2000
C CC TAGC G C C C C G T 

2001 ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 204 0 
CC T CC T CC 

2 041 GAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTA 2080 
GA G C C7 G CCC C 

2081 ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGAT 2120 
C G T T C C 
2121 TACCATCCAAGGAGGGGATGACGTATTTAAAGAAAATTAC 2160
C CCTGCGGC 

2161 GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200 
CCCATCC CTC 

2 201 ATTTGTATCAAAAAATCGATGAATCAAAATTAAAAGCCTT 2240 
C CGG GCCC 

2241 TACCCGTTATCAATTAAGAGGGTATATCGAAGATAGTCAA 2280 
' C AG CT C C CC 

2281 GACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATG 2320 
CT CCGCAG CGC 

2 321 AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 23 60 
GCG C T CC A 

2361 TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 2400 
T TC C T G T C 

2 4 01 CGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATT 2 4 4 0 
AT G G C 

2441 GTTCGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCA 2 4 80 
C C C C G C T 

2481 TCATTTCTCCTTAGACATTGATGTAGGATGTACAGACTTA 2520
C GCG T C G 

2 521 AATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGA 25 60 
C A C C C C 

2 5 61 CGCAAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCT 2 600 
C C A T C C T 

2601 CGAAGAGAAACCATTAGTAGGAGAAGCGCTAGCTCGTGTG 2640 
G C T T C 

2641 AAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAAAAAT 2680
G A G G G G C 

2 681 TGGAATGGGAAACAAATATCGTTTATAAAGAGGCAAAAGA 2720 
C T C CGC 

2 721 ATCTGTAGATGCTTTATTTGTAAACTCTCAATATGATCAA 27 60 
GCG GCG C G 

2761 TTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAG 2800
G CCCCC CCC 

2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 2840 
C G C T G CT 
2841 GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2880
T C CT GCTCCCG 

2 881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 2 920 
CTGA CTC TGC 

2921 ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 2960
C C CGC CCC 

2 961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3000 

C CAG T T G C G G 

3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 304 0 
G T G C G GTG 

3041 GGGAAGCAGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG 3080 
T C G A A A 

3 081 TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 3120 

A A CTC GCT 

3121 TATGGAGAAGGTTGCGTAACCATTCATG AGATCGAG AAC A 3160 
C T G G C C 

3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200
C C G T CTC C G A 

3201 AATCTATCCAAATAACACGGTAACGTGTAATGATTATACT 3240
CC CTTCCCC 

3 241 GTAAATCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 32 80 
G G G C ■ AGC 

3281 ATCGAGGATATAACGAAGCTCCTTCCGTACCAGCTGATTA 3320 
CA T C T T C 

3 321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 3 3 60 
C C G C G G CC CA 

3361 AGAGAGAATCCTTGTGAATTTAACAGAGGGTATAGGGATT 34 00 
CT C CGC TC C 

3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAGAATTAGA 3 4 40 
A T C TCGGCT 

3441 ATACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3480
G TTG CAG C CT 

3481 GAAACGGAAGGAACATTTATCGTGGACAGCGTGGAATTAC 3 5 20 
C G C C GC T 

3521 TCCTTATGGAGGAA 3534 
T G 
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I A T G AC T G C AG A T 7- Jr. T A A T A C G G AA G C A C T A G A T A G C T C T A 4 0 
C "C C C C C C T 

41 CAACAAAAGATGTCATTCAAAAAGGCATTTCCGTAGTAGG 80
CTG TCGGTC . T G 

81 TGATCTCCTAGGCGTAGTAGGTTTCCCGTTTGGTGGAGCG 120 
AC T G GTATCC C 

121 CTTGTTTCGTTTTATACAAACTTTTTAAATACTATTTGGC 160 
C GAGC C C C C C 

161 C AAGTG AAGAC C C GTGG AAGGCTT TT ATGG AAC AAGT AGA 200 
CG T AAC G T 

201 AGCATTGATGGATCAGAAAATAGCTGATTATGCAAAAAAT 2 40 
TCT GTA CGC 

241 AAAGCTCTTGCAGAGTTACAGGGCCTTCAAAATAATGTCG 280 
GTG ACC GC G 

2 81 AAGATTATGTGAGTGCATTGAGTTCATGGCAAAAAAATCC 320 

G C C TCCAGC G G C 

321 TGTGAGTTCACGAAATCCACATAGCCAGGGGCGGATAAGA 3 60 
T C CA T C A TA C 

3 61 GAGCTGTTTTCTCAAGCAGAAAGTCATTTTCGTAATTCAA 400 

T C C TCC C CA A C 

401 TGCCTTCGTTTGCAATTTCTGGATACGAGGTTCTATTTCT 440 
AGC T C C T T C 

441 AACAAC ATATGC ACAAGC TGCCAAC AC ACATTT ATT TTTA 4 80 
CTC T CCGCC 

481 CTAAAAGACGCTCAAATTTATGGAGAAGAATGGGGATACG 520 
T G C G 

521 AAAAAGAAGATATTGCTGAATTTTATAAAAGACAACTAAA 560 
G GC G C C GC T T 

5 61 ACTTACGCAAGAATATACTGACCATTGTGTCAAATGGTAT 600 
G C C G C C G 

601 AATGTTGGATTAGATAAATTAAGAGGTTCATCTTATGAAT 64 0 
C TC C GC C C T C C G 

641 CTTGGGTAAACTTTAACCGTTATCGCAGAGAGATGACATT 68 0 
G C A A CA G C 
681 AACAGTATTAGATTTAATTGCACTATTTCCATTGTATGAT 720
G T GC C C T C C C C 

721 GTTCGGCTATACCCAAAAGAAGTTAAAACCGAATTAACAA 7 60 
GAAC G G TGCTC 

7 61 GAGACGTTTTAACAGATCCAATTGTCGGAGTCAACAACCT 800 
GC C T C T 

801 TAGGGGCTATGGAAC AAC CTTCTCTAATATAG AAAATTAT 840 
T T AGC C C C 

841 AT7CG AAAACCAC ATC7 ATTTGAC T ATCTG C ATAG AATTC 880
AG C C T C 

881 AATTTCACACGCGGTTCCAACCAGGATATTATGGAAATGA 920 
C AA T C T C 

921 CTCTTTCAATTATTGGTCCGGTAATTATGTTTCAACTAGA 960 
C C C C C 

961 CCAAGCATAGGATCAAATGATATAATCACATCTCCATTCT 1000 
T T C C C 

1001 ATGGAAATAAATCCAGTGAACCTGTACAAAATTTAGAATT 104 0 
TCG GGCCTG 

1041 TAATGGAGAAAAAGTCTATAGAGCCGTAGCAAATACAAAT 1080 
C C C G C C C 

10 81 CTTGCGGTCTGGCCGTCCGCTGTATATTCAGGTGTTACAA 112 0 
CTG A A T ,C C C ' 

1121 AAGTGG AATTTAGCC AATATAATGATCAAACAGATG AAGC 1160 
G G TG C GC G 

1161 AAGTACACAAACGTACGACTCAAAAAGAAA7GTTGGCGCG 1200 
CCCGT CCTC A 

1201 GTCAGCTGGGATTCTATCGATCAATTGCCTCCAGAAACAA 1240 
TCT C C 

1241 CAGATGAACCTCTAGAAAAGGGATATAGCCATCAACTCAA 1280 
C AT G G CC C. T 

1281 TTATGTAATGTGCTTTTTAATGCAGGGTAGTAGAGGAACA 1320 
C G C G A TCC G C 

1321 ATCCCAGTGTTAACTTGGACACATAAAAGTGTAGACTTTT 13 60 
T G C C GTCC G C 

13 61 TTAACATGATTG ATTCG AAAAAAAT7AC AC AACTTCCGTT 1400 
C C AGC G G C T C 
1401 AGTAAAGGCATATAAGTTACAATCTGGTGCTTCCGTTGTC 1440
G G A C C C G 

1441 GCAGGTCCTAGGTTTACAGGAGGAGATATCATTCAATGCA 1480 
CACT TC CG 

1481 C AGAAAATGGAAGT GC GGC AACTATTT ACGTTAC AC CGG A 1520 
GCCCAT C G T 

1521 TGTGTCGTACTCTCAAAAATATCGAGCTAGAATTCATTAT 15 60 
T G G CA G AC T C 

15 61 GCTTCTACATCTCAGATAACATTTACACTCAGTTTAGACG 1600 
A CAGC C C C C G T 

1601 GGGCACCATTTAATCAATACTATTTCGATAAAACGATAAA 1640 
A CCCGTCTCGCC 

1641 TAAAGGAGACACATTAACGTATAATTCATTTAATTTAGCA 1680 
C T TC C A C AGC C C G 

1681 AGTTTCAGCACACCATTCGAATTATCAGGGAATAACTTAC 1720 
T C C C C TC T 

1721 AAATAGGCGTCACAGGATTAAGTGCTGGAGATAAAGTTTA 17 60 
GC CTCCCC C C 

17 61 TATAGACAAAATTGAATTTATTCC AGTGAAT 17 91 
C C G G C C C 
1 ATG AAT AAT G T AT T G AAT AGT G G AAG AAC AAC T AT T T 40
GAC C C C CTC 7 C C 

41 GTGATGCGTATAATGTAGTAGCCCATGATCCATTTAGTTT 80 
CCACCCG7C CC 

81 TGAACATAAATCATTAGATACCATCCAAAAAGAATGGATG 120 
C C GAGCC C C T T G G G 

121 GAGTGGAAAAGAACAGATCATAGTTTATATGTAGCTCCTG 160 
A C T T C CTC C C C C A 

161 TAGTCGGAACTGTGTCTAGTTTTTTGCTAAAGAAAGTGGG 200 
G T A CCCCTC GC 

201 GAGTCTTATTGGAAAAAGGATATTGAGTGAATTATGGGGG 240 
CTC C C CTC TCC C C T 

241 ATAATATTTCCTAGTGGTAGTACAAATCTAATGCAAGATA 280
. C C ATC GTCC T C C 

281 TTTTAAGGGAGACAGAACAATTCCTAAATCAAAGACTTAA 320

CG C GTCCGCTC 

321 TACAGATACCCTTGCTCGTGTAAATGCAGAATTGATAGGG 3 60 
C T TG AAC C T G CT 

3 61 CTC C AAG C GAAT AT AAGGG AGTTT AAT C AAC AAG TAG AT A 400 

ACTCT CCG GC 

4 01 ATTTTTTAAACCCTACTCAAAACCCTGTTCCTTTATCAAT 4 40 

CCGTA GT G CTC 

441 AACTTCTTCGGTTAATACAATGCAGCAATTATTTCTAAAT 4 80 
C CGCT C C C C C 

481 AGATTACCCCAGTTCCAGATACAAGGATACCAGTTGTTAT 520
G T T T C • C CC 

521 TATTACCTTTATTTGCACAGGCAGCCAATATGCATCTTTC 5 60 
TC T AC C T T C CT G 

561 TTTTATTAGAGATGTTATTCTTAATGCAGATGAATGGGGT 600
CCACTCGCCCTC A 

601 ATTTCAGCAGCAACATTACGTACGTATCGAGATTACCTGA 64 0 
C T C TC TA G A CA C T 

641 GAAATTATACAAGAGATTATTCTAATTATTGTATAAATAC 680 
GCCTCT CCC CCC 
681 GTATCAAACTGC GTTTAC-AGGGTTAAAC ACC CGT7TAC AC 720

T G C C T AC C T TA GC T 

721 GATATGTTAGAATTTAGAACATATATGTTTTTAAATGTAT 760
C CTGCGCC CCTCG 

7 61 TTGAATATGTATCCATTTGGTCATTGTTTAAATATCAGAG 800 

G C CAG AGTC C C G C 

801 TCTTATGGTATC7TCTGGCGCTAATTTATATGCTAGCGGT 840 
CTG GC AC CCC CTCT C 

841 AGTGGACCACAGCAGACACAATCATTTACAGCACAAAACT 880

A T GAGC C T G 

881 GGCCATTTTTATATTCTCTTTTCCAAGTTAATTCGAATTA 920 
C G AGCT G C C C C 

921 TATATTATCTGGTATTAGTGGTACTAGGCTTTCTATTACC 960 
C TC CAG CTC G C A C C A 

961 TTCCCTAATATTGGTGGTTTACCGGGTAGTACTACAACTC 1000
T C C AC T A CTCC C 

AGCC T CTC A G C C T 7 

1041 TTCATCTGGTCTCA7AGGGGCGAC7AATCTCAATCACAAC 1080
CAGC AT G 7 T A CT G C 

10 81 TTTAATTGCAGCACGGTCCTCCCTCCTTTATCAACACCAT 1120 
C TC C T G A C GAGC G 

1121 TTGTTAGAAGTTGGCTGGATTCAGGTACAGATCGAGAGGG 1160 
G GTCC T CAGC T C A 

1161 CGTTGCTACCTCTACGAATTGGCAGACAGAATCCTTTCAA 12 00 
A A C A C G C 

1201 ACAACTTTAAGTTTAAGGTGTGGTGCTTTTTCAGCCCGTG 12 4 0 
CCTCCTC A CTA 

12 41 G AAATTC AAACT ATTTC C C AG ATT AT7TT AT C C GT AAT AT 1280 
G CT CCCTAGC 

12 81 TTCTGGGGTTCCTTTAGTTATTAGAAACGAAGATCTAACA 132 0 

C T CCCCGT CCC 

13 21 AGACCGTTACACTATAACCAAATAAGAAATATAGAAAGTC 13 60 

C T AC T T C G T G C C GTC 

13 61 CTTCGGGAACACCTGGTGGAGCACGGGCCTATTTGGTATC 1400 
ACTTAAT AATCCCG 
1401 TGTGCATAACAGAAAAAJ^TAATATCTATGCCGCTAATGAA 1440
C GGCC CTCCG 

14 41 AATGGTACTATGATGCATTTGGCGCCAGAAGATTATACAG 14 80 
CC TCCTA . C T 

1481 G ATTTAC T ATATC GCC AAT AC ATG CC AC TC AAGTGAAT AA 1520 
CCCT C TC C 

1521 TCAAACTCGAACATTTATTTCTGAAAAATTTGGAAATCAA 15 60 
GACCCCC GC 

1561 GGTGATTCCTTAAGATTTGAACAAAGCAACACGACAGCTC 1600 
C GGCGTC T C A 

1601 GTTATACGCTTAGAGGGAATGGAAATAGTTACAATCTTTA 164 0 
GCTTG C CC C 

1641 TTTAAGAGTATCTTCAATAGGAAATTCAACTATTCGAGTT 1680
C G TAGC CTTCCCCT 

1681 ACTATAAACGGTAGAGTTTATACTGTTTCAAATGTTAATA 1720 
CC ACT CACT GC 

1721 CCACTACAAATAACGATGGAGTTAATGATAATGGAGCTCG 17 60 
TAGCT C CCC CA 

1761 TTTTTCAGATATTAATATCGGTAATATAGTAGCAAGTGAT 1800 
A CAGC CCCTCCCG CT'C C 

1801 AATACTAATGTAACGCTAGATATAAATGTGACATTAAACT 1840
C CTTTGCC CCCT 

1841 CCGGTACTCCATTTGATCTCATGAATATTATGTTTGTGCC 1880
T A C C 

1881 AACTAATCTTCCACCACTTTAT 1902 
C C T T G C 
1 ATGGAGGAAAATAATCAAAATCAATGCATACCTTACAATT 40
G C C C T A C 

41 GTTTAAGTAATCCTGAAGAAGTACT TTTGGATGGAGAACG 80 
C G C A G T GC T 

81 GATATCAACTGGTAATTCATCAATTGATATTTCTCTGTCA 120 
CT C CTCCCCCT C 

121 CTTGTTCAGTTTCTGGTATCTAACTTTGTACCAGGGGGAG 160
T G C CAGC C G T T 

161 GATTTTTAGTTGGATTAATAGATTTTGTATGGGGAATAGT 200 
GCCTCC TCCC TC 

201 TGGCCCTTCTCAATGGGATGCATTTCTAGTACAAATTGAA 2 40 
T A C G G G 

2 41 CAATTAATTAATGAAAGAATAGCTGAATTTGCTAGGAATG 280 

GGCCGGC GCC C 

281 CTGCTATTGCTAATTTAGAAGGATTAGGAAACAATTTCAA 3 20 
CC CG GCTC 

321 TATATATGTGGAAGCATTTAAAGAATGGGAAGAAGATCCT 3 60 
CC GCC G GC 

3 61 AATAATCCAGAAACCAGGACCAGAGTAATTGATCGCTTTC 4 00 

C G CCTGGCCAACA 

401 GTATAC TTGATGGGCTACTTGAAAGGGACATTCCTTCGTT 4 40 
A CT GCCCTGGATCAC 

441 TCGAATTTCTGGATTTGAAGTACCCCTTTTATCCGTTTAT 480
CA C CC TTCG GC 

481 GCTCAAGCGGCCAATCTGCATCTAGCTAT ATTAAGAGATT 52 0 
AT T C C CC TC CA 

521 CTGTAATTTTTGGAGAAAGATGGGGATTGACAACGATAAA 5 60 
GCC G G CT C 

5 61 TGTCAATGAAAACTATAATAGACT AATTAGGCATATTGAT 600 
C GTCC TC C C 

601 GAATATGCTGATCACTGTGCAAATACGTATAATCGGGGAT 640 
G C C C TCCCCTC 

641 TAAATAATTTACCGAAATCTACGTATCAAGATTGGATAAC 680
GCCCTG T T 

681 ATATAATCGATTACGGAGAGACTTAACATTGACTGTATTA 7 20 
C C CA G GA G CC C A T G 
721 GATATCGCCGCTTTCTTTCCAAACTATGAC AATAGGAGAT 760
C T A C G C 

7 61 ATCCAATTCAGCCAGTTGGTCAACTAACAAGGGAAGTTTA 800 
CTCA G TCA C 

801 TACGGACCCATTAATTAATTTTAATCCACAGTTACAGTCT 840 
T CT CCCT G AAG 

841 GTAGCTCAATTACCTACTTTTAACGTTATGGAGAGCAGCC 880
CCCTCAC C TC 

881 GAATTAGAAATCCTCATTTATTTGATATATTGAATAATCT 920
TCGCACG CC CC 

921 TACAATCTTTACGGATTGGTTTAGTGTTGGACGCAATTTT 960 
T CC. CC GTCC 

961 TATTGGGGAGGACATCGAGTAATATCTAGCCTTATAGGAG 1000
T CA G C C CTCT ' T 

1001 GTGGTAACATAACATCTCCTATAT ATGGAAGAGAGGCGAA 104 0 
G T C C C T A 

1041 CCAGGAGCCTCCAAGATCCTTTACTTTTAATGGACCGGTA 1080 
A C TAGT G C C C T A C 

1081 TTTAGGACTTTATCAAATCCTACTTTACGATTATTACAGC 1120 
CACGTC CGA GCC. 

1121 AACCTTGGCCAGCGCCACCATTTAATTTACGTGGTGTTGA 1160 
T T C CC TA A 

1161 AGGAGTAGAATTTTCTACACCTACAAATAGCTTTACGTAT 1200 
G C T G C T C CTC C T C 

1201 CGAGGAAGAGGTACGGTTGATTCTTTAACTGAATTACCGC 12 4 0 
A T AC CGCCCA 

1241 CTGAGGATAATAGTGTGCCACCTCGCGAAGGATATAGTCA 1280 
A C C CA G C CTCC 

1281 TCGTTTATGTCATGCAACTTTTGTTCAAAGATCTGGAACA 13 20 
CAGGCC CCGGCTC T 

1321 CCTTTTTTAACAACTGGTGTAGTATTTTCTTGGACCGATC 1360 
ACCCTAATGCA T 

13 61 GTAGTGCAACTCTTACAAATACAATTGATCCAGAGAGAAT 1400 
T C T C C G 
1401 TAATCAAATACCTTTAGTGAAAGGATTTAGAGTTTGGGGG 1440

C CAGCGTCCTG A 

1441 GGCACCTCTGTCATTACAGGACCAGGATTTACAGGAGGGG 148 0 

AT C C C T 

1481 ATATCCTTCGAAGAAATACCTTTGGTGATTTTGTATCTCT 1520
T A C T C C GAGC 

1521 ACAAGTCAATATTAATTCACCAATTACCCAAAGATACCGT 1560
C TCCCT - T T 

15 61 TTAAGATTTCGTTACGCTTCCAGTAGGGATGCACGAGTTA 1600 

C C G A TTCCC T C TA C 

1601 TAGTATTAACAGGAGCGGCATCCACAGGAGTGGGAGGCCA 164 0 
CGCCCCATTCTCTA 

1641 AGTTAGTGTAAATATGCCTCTTCAGAAAACTATGGAAATA 1680 
CTCC G C AC G G C 

1681 GGGGAGAACTTAACATCTAGAACATTTAGAXATACCGATT 17 20 
C G CGCC C C 

1721 TTAGTAATCCTTTTTCATTTAGAGCTAATCCAGATATAAT 17 6 0 
CTC C CAGT CC T C C T C C 

17 61 TGGGATAAGTGAACAACCTCTATTTGGTGCAGGTTCTATT 18 00 
CTC C AT AGC C 

1801 AGTAGCGGTGAACTTTATATAGATAAAATTGAAATTATTC 18 40 
TCATCT C TGCTCG GC 

1841 TAGCAGATGCAACATTTGAAGCAGAATCTGATTTAGAAAG 1880
TCCTCCCGTG ACA CC T G 

1881 AGCACAAAAGGCGGTGAATGCCCTGTTTACTTCTTCCAAT 1920 
C f Q T C C C - CA 

1921 CAAATCGGGTTAAAAACCGATGTGACGGATTATCATATTG 1960
GC T C G TA C T T C C 

1961 ATCAAGTATCCAATTTAGTGGATTGTTTATCAGATGAATT 2000 
C G C G CACC ACC TAGC G 

2 001 TTGTCTGGATGAAAAGCGAGAATTGTCCGAGAAAGTCAAA 20 40 
CCCCG TCC T 

2041 CATGCGAAGCGACTCAGTGATGAGCGGAATTTACTTCAAG 2080 
CC T CCA CCTG 

2081 ATCCAAACTTCAGAGGGATCAATAGACAACCAGACCGTGG 2120 
CT C A AC C G G A 
2121 CTGGAGAGGAAGTACAGATATTACCATCC AAGGAGGAGAT 2160
TGT CCGGC CC 

2161 GACGTATTCAAAGAGAATTACGTCACACTACCGGGTACCG 2200 
TG G C CCTCATT 

2201 TTGATGAGTGCTATCCAACGTATTTATATCAGAAAATAGA 22 4 0 
CC CTCCGC G C 

2241 TGAGTCGAAATTAAAAGC7TATACCCGTTATGAATTAAGA 2280
C CC CTC AG CCT 

2281 GGGTATATCGAAGATAGTCAAGACTTAGAAATCTATTTGA 2320 
CC CC CT CC 

2321 TCCGTTACAATGCAAAACACGAAATAGTAAATGTGCCAGG 2360 
AG CG GCCG C 

23 61 CACGGGTTCC7TATGGCCGCTTTCAGCCCAAATGCCAATC 2400 
T T C C A T TC7 C T 

2 4 01 GGAAAGTGTGGAGAACCGAATCGATGCGCGCCACACCTTG 24 4 0 
G G T CA T 

2 441 AATGGAATCCTGATCTAGATTGTTCCTGCAGAGACGGGGA 2480 
G CTGCC G T C 

2481 AAAATGTGCACATCATTCCCATCATTTCACCTTGGATATT 2520
GG CC T CT .CC 

2 521 GATGTTGGATGTAC AGACTTAAATGAGGACTTAGGTGTAT 25 60 
G TCG CCAC 

2 5 61 GGGTGATATTC AAGATTAAGACGC AAGATGGCCATGCAAG 2 600 
C C C C C A C 

2 601 ACTAGGGAATCTAGAGTTTCTCGAAGAGAAACCATTATTA 2 64 0 
T C C T GG C 

2 641 GGGGAAGCACTAGCTCGTGTGAAAAGAGCGGAGAAGAAGT 2 680 
T T C G A 

2 681 GGAGAGACAAACGAGAGAAACTGCAGTTGGAAACAAATAT 272 0 
G T CG A G T C 

2 721 TGTTTATAAAGAGGCAAAAGAATCTGTAGATGCTTTATTT 27 60 
C CG C GCG GC 

27 61 GTAAACTCTCAATATGATAGATTACAAGTGGATACGAACA 2800 
G C CAG G CC C C 

2801 TCGCCATGATTCATGCGGCAGATAAACGCG7TCATAGAAT 2840
CCC C TGCC 
2841 CCGGGAAGCGTATCTGCCAGAGTTGTCTGTGATTCCAGG7 2880
TTGTCT T C CT 

2881 GT C AAT GCGGC C ATTTTC G AAG AATT AG AGGG ACGT ATTT 2920 
GCT C GCT C 

2 921 TTACAGCGTAT7CCTTATATGATGCGAGAAATGTCATTAA 2 960 

CATC GC C C C 

2961 AAATGGCGATTTCAATAATGGCTTATTATGCTGGAACGTG 3000 
G C T C C C CAGC T 

3001 AAAGGTCATGTAGATGTAGAAGAGCAAAACAACCACCGTT 3040
GCGGAG TG 

3041 CGGTCCTTGTTATCCCAGAATGGGAGGCAGAAGTGTCACA 3080 
C GGGTG AT C 

3081 AGAGGTTCGTGTCTGTCCAGGTCGTGGCTATATCCTTCGT 3120 
ft A A A C T C 

3121 GTCACAGCATATAAAGAGGGATATGGAGAGGGCTGCGTAA 3160 
GCTCG CT T G 

3161 CGATCCATGAGATCGAAGACAATACAGACGAACTGAAATT 3200 
. C C GA C C G T G 

3 201 CAGCAACTGTGTAGAAGAGGAAGTATATCCAAACAACACA 3240 

TC CC.GAAC C C 

32 41 GTAACGTGTAATAATTATACTGGGACTCAAGAAGAATATG 3 280 
TTCCGCC TAG GC 

3281 AGGGTACGTACACTTCTCGTAATCAAGGATATGACGAAGC 3 320 
GA G C AGC CAG T CA 

3 321 CTATGGTAATAACCCTTCCGTACCAGCTGATTACGCTTCA 3 3 60 
TCC TCXXXXXXXXXXXX T T C T C C 

3361 GTCTATGAAGAAAAATCGTATACAGATGGACGAAGAGAGA 3 400 
GCGG CC CA C T 

3401 ATCCTTGTGAATCTAACAGAGGCTATGGGGATTACACACC 3 440 
C C G TC T CA C 

3441 ACTACCGGCTGGTTATGTAACAAAGGATTTAGAGTACTTC 3480 
TAT C TC GCT T 

3481 CCAGAGACCGATAAGGTATGGATTGAGATCGGAGAAACAG 3 5 20 
T CAGC T C 

3 521 AAGGAACATTCATCGTGG ATAGCGTGGAATTACTCCTTAT 3 5 60 
G C C GC T T G 
1 AGATC 7 AGAGGTAATTG TT ATGAG? ACTGTCGTGGTT AAG 40
GATC 

41 GGAAACGTCAACGG7GGTGTACAACAACCTAGAAGGAGGA 80 
G T A 

81 GAAGGCAATCCCTTCGCAGGAGGGCTAACAGAGTACAGCC 120 
T A T 

121 AGTGGTTATGGTCACTGCTCCTGGCGAACCCAGGAGGAGG 160
GC A A A 

161 AGACGCAGAAGAGGAGGCAATCGCAGGTCAAGAAGAACTG 200 
AG T A 

201 GAGTTCCCAGGGGAAGGGGCTCAAGCGAGACATTCGTGTT 240 
A A T 

241 TACAAAGGACAACCTCGTGGGCAACTCCCAAGGAAGTTTC 280

281 ACCTTCGGACCAAGTGTATCAGACTGTCCAGCATTCAAGG 320 
T 

321 ATGGAATACTCAAGGCCTACCATGAGTACAAGATCACAAG 360
T 

3 61 TATCCTTCTTCAGTTCGTCAGCGAGGCCTCTTCCACCTCA 400 

T G T 

401 CCAGGATCCATCGCTTATGAGTTGGACCCACATTGCAAAG 440
C AT 

441 TATCATCCCTCCAGTCCTACGTCAACAAGTTCCAAATCAC 4 80 
T 

4 81 AAAGGGAGGAGCTAAGACCTATCAAGCTAGGATGATCAAC 520 

T T C T 

521 GGAGTAGAATGGCACGATTCATCTGAGGATCAGTGCAGGA 560
T T A 

5 61 T ACTTTG G AAAGG AAG TGG AAAATC TTC AG ACCCAGC AGG 600 

C A G T T 

601 ATCTTTCAGAGTCACCATCAGAGTGGCTCTTCAAAACCCC 640 
T 1 A 

641 AAGTAATAGACTCCGGATCAGAGCCTGGTCCAAGCCCACA 680 
A T 
681 ACCAACACCCACTCCAACTCCCCAAAAGCATGAGCGATTT 720
721 ATTGCTTACGTCGGCATACCTATGCTGACCATTCAAGAAT 760
761 TC 762